
Python: Divide values in cell by max in each column




Is this an efficient or correct way to divide every cell in each column by the maximum value in that column within a table? Is there a better implementation (if this is correct)? Note: All values >= 0

new_data = [];
for row in np.transpose(data)[1::]: #from 1 till end
    for elements in row:
        if sum(elements) != 0:
new_data = np.transpose(new_data);


Data:
id col1 col2 col3 col4
A   2    1    4    0
B   3    8    2    0
C   2    3    0    0
D   5    5    3    0
E   6    3    3    0


Expected output:
id col1 col2 col3 col4
A  1/3  1/8  1     0  
B  1/2  1    1/2   0
C  1/3  3/8  0     0
D  5/6  5/8  3/4   0
E  1    3/8  3/4   0
Black Avatar asked Feb 19 '14 03:02


1 Answer

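How do you want to handle a column whose maximum is 0, like the last column? In theory 0/0 should be nan. Also, the check sum(elements) != 0 is not the right test in general: what if a column is -2 -1 0 1 2? Its sum is 0, but dividing by the max should still result in -1 -0.5 0 0.5 1, right?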
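To make the point about sum versus max concrete, here is a minimal sketch. The array name col is only illustrative, and the values are the ones from the example above; nothing here comes from the asker's actual data.

```python
import numpy as np

# Hypothetical column, using the example values mentioned above.
col = np.array([-2, -1, 0, 1, 2], dtype=float)

# The sum is zero, so a check like `sum(elements) != 0` would skip this column...
print(col.sum())

# ...even though the maximum is non-zero and the division is well defined.
print(col.max())

scaled = col / col.max()
print(scaled)
```

With the question's constraint that all values are >= 0, the two checks agree, since a non-negative column sums to zero only when every entry is zero. Testing the max directly is still the safer habit.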

In [138]:

A*1./np.max(A, axis=0)
array([[ 0.33333333,  0.125     ,  1.        ,         nan],
       [ 0.5       ,  1.        ,  0.5       ,         nan],
       [ 0.33333333,  0.375     ,  0.        ,         nan],
       [ 0.83333333,  0.625     ,  0.75      ,         nan],
       [ 1.        ,  0.375     ,  0.75      ,         nan]])

Alternatively, we can leave an all-zero column such as the last one as it is, using np.where:

In [141]:

np.where(np.max(A, axis=0)==0, A, A*1./np.max(A, axis=0))
array([[ 0.33333333,  0.125     ,  1.        ,  0.        ],
       [ 0.5       ,  1.        ,  0.5       ,  0.        ],
       [ 0.33333333,  0.375     ,  0.        ,  0.        ],
       [ 0.83333333,  0.625     ,  0.75      ,  0.        ],
       [ 1.        ,  0.375     ,  0.75      ,  0.        ]])
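Note that np.where still evaluates the division for every cell, so the 0/0 entries are computed (and a RuntimeWarning may be printed) before they are discarded. One possible way around that, sketched here under the assumption that A is the numeric part of the question's table stored as floats, is to pass out and where to np.divide. The names col_max and result are only illustrative.

```python
import numpy as np

# Numeric part of the question's table (the id column is left out).
A = np.array([[2, 1, 4, 0],
              [3, 8, 2, 0],
              [2, 3, 0, 0],
              [5, 5, 3, 0],
              [6, 3, 3, 0]], dtype=float)

col_max = A.max(axis=0)       # one maximum per column, shape (4,)
result = np.zeros_like(A)     # entries in skipped columns stay at 0

# Divide only where the column maximum is non-zero. The `where` mask is
# broadcast against A, so whole columns are skipped and no 0/0 is evaluated.
np.divide(A, col_max, out=result, where=col_max != 0)
print(result)
```

Skipped columns keep whatever was in result beforehand, which is why it is initialised explicitly. Filling with zeros matches "leave the column as it is" only because all values are >= 0, so a column with max 0 is all zeros; with negative values you would want to start from A.copy() instead.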

The correct way of doing it with a loop is:

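new_data = []
for col in A.T:  # iterating over A.T yields the columns of A
    col_max = max(col)
    # scale by the column max; keep an all-zero column unchanged
    new_data.append([item*1./col_max for item in col] if col_max > 0 else list(col))
new_data = np.transpose(new_data)  # back to the original row/column layout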
CT Zhu Avatar answered Oct 03 '22 20:10
