
Calculate the percentile rank of a value in a multi-dimensional array along an axis

I have a 3-dimensional array.

>>> M2 = np.arange(24).reshape((4, 3, 2))
>>> M2
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]]])

I would like to calculate the percentile rank of a particular value along axis = 0.

E.g. if the value = 4, the output is expected to be:

[[0.25, 0.25],
 [0.25, 0.25],
 [0.25, 0.0]]

where the 0.25 at [0][0] is the percentile rank of 4 in [0, 6, 12, 18], etc.
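
To make the definition concrete, here is a minimal sketch for a single 1-D slice. It assumes "percentile rank" means the fraction of values less than or equal to the given value, which is consistent with the expected outputs in this question. `percentile_rank_1d` is a hypothetical helper name, not a NumPy or SciPy function.

```python
import numpy as np

def percentile_rank_1d(values, score):
    """Fraction of entries in `values` that are <= `score` (assumed definition)."""
    values = np.asarray(values)
    return np.count_nonzero(values <= score) / values.size

M2 = np.arange(24).reshape((4, 3, 2))

# Slice along axis 0 at position [0][0]: [0, 6, 12, 18]; one of four values is <= 4
rank_00 = percentile_rank_1d(M2[:, 0, 0], 4)

# Slice along axis 0 at position [2][1]: [5, 11, 17, 23]; no value is <= 4
rank_21 = percentile_rank_1d(M2[:, 2, 1], 4)

print(rank_00, rank_21)
```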

If the value = 2.5, the output is expected to be:

[[0.25, 0.25],
 [0.25, 0.0],
 [0.0, 0.0]]

I was thinking of using scipy.stats.percentileofscore, but it does not seem to work with multi-dimensional arrays.

---------------------------- Edit ---------------------------

Inspired by Evan's comment, I came up with a solution using scipy.stats.percentileofscore.

import numpy as np
from scipy import stats

percentile_rank_lst = []
for p in range(M2.shape[1]):
    for k in range(M2.shape[2]):
        # 1-D slice along axis 0 at position (p, k)
        M2_ = M2[:, p, k]
        # percentileofscore returns a percentage, so scale it to [0, 1]
        percentile_rank = stats.percentileofscore(M2_, 4) / 100
        percentile_rank_lst.append(percentile_rank)

percentile_rank_nparr = np.array(percentile_rank_lst).reshape(M2.shape[1], M2.shape[2])
print(percentile_rank_nparr)

The output was:

[[0.25 0.25]
 [0.25 0.25]
 [0.25 0.  ]]
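
The nested loops can probably be folded into a single call with np.apply_along_axis, which hands each 1-D slice along the chosen axis to a function. The following is only a sketch: `rank_along_axis` is a name made up for illustration, and it uses percentileofscore with its default `kind`. Note that apply_along_axis still loops in Python internally, so this is a convenience rather than a speedup.

```python
import numpy as np
from scipy import stats

M2 = np.arange(24).reshape((4, 3, 2))

def rank_along_axis(arr, score, axis=0):
    """Apply scipy.stats.percentileofscore to every 1-D slice along `axis`."""
    # percentileofscore returns a percentage (0-100), so divide by 100
    return np.apply_along_axis(
        lambda col: stats.percentileofscore(col, score), axis, arr
    ) / 100

result = rank_along_axis(M2, 4)
print(result)
```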
alextc asked Oct 15 '22 08:10


1 Answer

I think this does the job:

def get_percentile(val, M=M2, axis=0):
    return (M > val).argmax(axis) / M.shape[axis]

get_percentile(4)
#array([[0.25, 0.25],
#       [0.25, 0.25],
#       [0.25, 0.  ]])

get_percentile(2.5)
#array([[0.25, 0.25],
#       [0.25, 0.  ],
#       [0.  , 0.  ]])
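
One caveat: the argmax version relies on each slice being sorted in ascending order along the axis (true for M2), and it returns 0 rather than 1 when val is at least as large as every element, because argmax of an all-False array is 0. A possible variant that counts the entries instead is sketched below. It assumes the percentile rank is the fraction of values less than or equal to val, and `percentile_rank` is a hypothetical name chosen for this example.

```python
import numpy as np

def percentile_rank(arr, val, axis=0):
    """Fraction of entries along `axis` that are <= `val` (assumed definition)."""
    arr = np.asarray(arr)
    # Counting does not depend on the order of values along the axis
    return np.count_nonzero(arr <= val, axis=axis) / arr.shape[axis]

M2 = np.arange(24).reshape((4, 3, 2))

r4 = percentile_rank(M2, 4)            # same result as the argmax version on M2
r_high = percentile_rank(M2, 100)      # val above every element: all ones

shuffled = M2[[2, 0, 3, 1]]            # reorder along axis 0; ranks should not change
r_shuffled = percentile_rank(shuffled, 4)

print(r4)
print(r_high)
```

For sorted slices and a val below each slice's maximum, the index of the first element greater than val equals the number of elements less than or equal to val, so both versions agree on the examples in the question.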
Quang Hoang answered Oct 19 '22 00:10