I have a 3-dimensional array.
>>> import numpy as np
>>> M2 = np.arange(24).reshape((4, 3, 2))
>>> M2
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5]],

       [[ 6,  7],
        [ 8,  9],
        [10, 11]],

       [[12, 13],
        [14, 15],
        [16, 17]],

       [[18, 19],
        [20, 21],
        [22, 23]]])
I would like to calculate the percentile rank of a particular value along axis = 0.
E.g. if the value = 4, the output is expected to be:
[[0.25, 0.25],
 [0.25, 0.25],
 [0.25, 0.0]]
where the 0.25 at [0][0] is the percentile rank of 4 in [0, 6, 12, 18], etc.
If the value = 2.5, the output is expected to be:
[[0.25, 0.25],
 [0.25, 0.0],
 [0.0, 0.0]]
I was thinking of using scipy.stats.percentileofscore, but it does not seem to work with multi-dimensional arrays.
---------------------------- Edit ---------------------------
Following Evan's comment, I came up with a solution using scipy.stats.percentileofscore:
from scipy import stats

percentile_rank_lst = []
for p in range(M2.shape[1]):
    for k in range(M2.shape[2]):
        # 1-D slice along axis 0 at position (p, k)
        M2_ = M2[:, p, k]
        # percentileofscore returns a percentage, so divide by 100
        percentile_rank = stats.percentileofscore(M2_, 4) / 100
        percentile_rank_lst.append(percentile_rank)
percentile_rank_nparr = np.array(percentile_rank_lst).reshape(M2.shape[1], M2.shape[2])
print(percentile_rank_nparr)
The output was:
array([[0.25, 0.25],
       [0.25, 0.25],
       [0.25, 0.0]])
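The double loop above can also be written with np.apply_along_axis, which hands each 1-D slice along the chosen axis to a function. This is only a minimal sketch, assuming the same M2 as above and the default kind='rank' of scipy.stats.percentileofscore; the variable name rank is just for illustration.

```python
import numpy as np
from scipy import stats

M2 = np.arange(24).reshape((4, 3, 2))

# Call percentileofscore(slice, 4) on every 1-D slice along axis 0.
# The result has the shape of M2 with axis 0 removed, i.e. (3, 2).
rank = np.apply_along_axis(stats.percentileofscore, 0, M2, 4) / 100
print(rank)
```

It still calls the SciPy function once per slice, so it is shorter than the loop but probably not much faster.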
I think this does the job:
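def get_percentile(val, M=M2, axis=0):
    # Index of the first element greater than val, divided by the axis length.
    # Assumes M is sorted in ascending order along axis and that at least one
    # element exceeds val; if none does, argmax returns 0.
    return (M > val).argmax(axis) / M.shape[axis]

get_percentile(4)
# array([[0.25, 0.25],
#        [0.25, 0.25],
#        [0.25, 0.  ]])

get_percentile(2.5)
# array([[0.25, 0.25],
#        [0.25, 0.  ],
#        [0.  , 0.  ]])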
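The argmax version above relies on the data being sorted along the axis and on at least one element being larger than the value. If that might not hold, a possible alternative is to count the entries that are less than or equal to the value. This is only a sketch: percentile_rank is a hypothetical helper name, and it uses the "weak" definition (fraction of entries <= val), which reproduces the expected outputs in the question but can differ from the default kind='rank' of percentileofscore when the value occurs more than once along the axis.

```python
import numpy as np

def percentile_rank(M, val, axis=0):
    """Fraction of entries along `axis` that are <= val (hypothetical helper)."""
    M = np.asarray(M)
    # The mean of a boolean array is the fraction of True entries.
    return (M <= val).mean(axis=axis)

M2 = np.arange(24).reshape((4, 3, 2))
rank_4 = percentile_rank(M2, 4)
rank_2_5 = percentile_rank(M2, 2.5)
# Reversing axis 0 leaves the result unchanged, since only counts matter.
rank_4_reversed = percentile_rank(M2[::-1], 4)
# A value above every entry gives 1.0 everywhere.
rank_100 = percentile_rank(M2, 100)
print(rank_4)
print(rank_2_5)
```

Because it only counts, the order of the elements along the axis does not matter, and a value above the maximum gives 1.0 rather than 0.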