Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Normalize numpy array columns in python

People also ask

How do I normalize a NumPy array by column?

Normalize Numpy Array By Columns You can use the axis=0 in the normalize function to normalize the NumPy array into a unit vector by columns.

How do you normalize a NumPy array in Python?

To normalize a 2D-Array or matrix we need NumPy library. For matrix, general normalization is using The Euclidean norm or Frobenius norm. Here, v is the matrix and |v| is the determinant or also called The Euclidean norm. v-cap is the normalized matrix.

How do I standardize data in NumPy array?

You need to specify that you would like to normalize for each column (np. mean(X, axis=0) and np. std(X, axis=0).


If I understand correctly, what you want to do is divide by the maximum value in each column. You can do this easily using broadcasting.

Starting with your example array:

import numpy as np

x = np.array([[1000,  10,   0.5],
              [ 765,   5,  0.35],
              [ 800,   7,  0.09]])

x_normed = x / x.max(axis=0)

print(x_normed)
# [[ 1.     1.     1.   ]
#  [ 0.765  0.5    0.7  ]
#  [ 0.8    0.7    0.18 ]]

x.max(0) takes the maximum over the 0th dimension (i.e. rows). This gives you a vector of size (ncols,) containing the maximum value in each column. You can then divide x by this vector in order to normalize your values such that the maximum value in each column will be scaled to 1.


If x contains negative values you would need to subtract the minimum first:

x_normed = (x - x.min(0)) / x.ptp(0)

Here, x.ptp(0) returns the "peak-to-peak" (i.e. the range, max - min) along axis 0. This normalization also guarantees that the minimum value in each column will be 0.


You can use sklearn.preprocessing:

from sklearn.preprocessing import normalize
data = np.array([
    [1000, 10, 0.5],
    [765, 5, 0.35],
    [800, 7, 0.09], ])
data = normalize(data, axis=0, norm='max')
print(data)
>>[[ 1.     1.     1.   ]
[ 0.765  0.5    0.7  ]
[ 0.8    0.7    0.18 ]]