Python Pandas: Get row by median value

Tags: python, pandas

I'm trying to get the row of the median value for a column.

I'm using data.median() to get the median value for 'column'.

id                 30444.5
someProperty           3.0
numberOfItems          0.0
column                70.0

And data.median()['column'] then gives:

data.median()['column']
>>> 70.0

How can I get the row or index of the median value? Is there anything similar to idxmax / idxmin?

I tried filtering, but it's not reliable when multiple rows have the same value.

Thanks!

Asked by BarakChamo on Jun 09 '14 at 16:06


1 Answer

You can use rank and idxmin and apply it to each column:

import numpy as np
import pandas as pd


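def get_median_index(d):
    # Percentile rank of each value in the column, from 1/n up to 1.0
    ranks = d.rank(pct=True)
    # Distance of each rank from the 50th percentile
    close_to_median = abs(ranks - 0.5)
    # Index label of the value whose rank is closest to 0.5
    return close_to_median.idxmin()


df = pd.DataFrame(np.random.randn(13, 4))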
df
    0           1           2           3
0   0.919681    -0.934712   1.636177    -1.241359
1   -1.198866   1.168437    1.044017    -2.487849
2   1.159440    -1.764668   -0.470982   1.173863
3   -0.055529   0.406662    0.272882    -0.318382
4   -0.632588   0.451147    -0.181522   -0.145296
5   1.180336    -0.768991   0.708926    -1.023846
6   -0.059708   0.605231    1.102273    1.201167
7   0.017064    -0.091870   0.256800    -0.219130
8   -0.333725   -0.170327   -1.725664   -0.295963
9   0.802023    0.163209    1.853383    -0.122511
10  0.650980    -0.386218   -0.170424   1.569529
11  0.678288    -0.006816   0.388679    -0.117963
12  1.640222    1.608097    1.779814    1.028625
df.apply(get_median_index, axis=0)
0    7
1    7
2    3
3    4
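
One caveat, as far as I can tell: rank(pct=True) yields ranks from 1/n to 1, so with an odd number of rows the 0.5 target sits exactly between two ranks and idxmin may return a neighbour of the median. In the output above, for example, column 0 reports row 7 (0.017064), while that column's median is 0.650980 in row 10. A minimal sketch of an alternative, which compares the values against the median directly and then pulls the full row with .loc, could look like this. The helper name median_row_label and the sample frame are made up for illustration.

```python
import pandas as pd


def median_row_label(s):
    """Return the index label of the value closest to s.median().

    Hypothetical helper, not part of pandas.
    """
    # Absolute distance of every value from the column's median;
    # idxmin returns the label of the first smallest distance.
    return (s - s.median()).abs().idxmin()


# Made-up sample data with a duplicated value in "column".
data = pd.DataFrame(
    {"id": [10, 11, 12, 13, 14], "column": [70.0, 10.0, 70.0, 90.0, 30.0]}
)

label = median_row_label(data["column"])  # index label of the median row
row = data.loc[label]                     # the full row as a Series
```

When several rows share the median value, idxmin returns the first matching label. With an even number of rows the median is the mean of the two middle values, so no row may equal it exactly and the sketch returns one of the two middle rows.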
Answered by j sad on Oct 10 '22 at 12:10