 

Groupby: returning the full row where the max occurs

Tags:

python

pandas

How do I get the full row of data for a groupby result?

df
   a   b   c  d   e
0  a  25  12  1  20
1  a  15   1  1   1
2  b  12   1  1   1
3  n  25   2  3   3

In [4]: df = pd.read_clipboard()

In [5]: df.groupby('a')['b'].max()
Out[5]: 
a
a    25
b    12
n    25
Name: b, dtype: int64

How do I get the full row?

a   b   c  d   e
a  25  12  1  20
b  12   1  1   1
n  25   2  3   3

I tried filtering with df[df.e == df.groupby('a')['b'].max()], but the sizes are different :(

Original data:

0          1       2        3     4        5     6      7       8    9   
EVE00101  Trial  DRY RUN  PASS  1610071  1610071  Y  20140808  NaN  29   

10        11                12           13                 14  
FF1  ./ff1.sh  Event Validation  Hive Tables  2015-11-30 9:40:34 

df.groupby([1, 7])[14].max() gives me the right values, but as a grouped Series with columns 1 and 7 as the index; I want the corresponding full rows with all their columns. The data has 15,000 rows; I have provided one row as a sample.

WoodChopper asked Dec 11 '15 10:12



1 Answer

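You can use idxmax() (called argmax() in older pandas; in current versions Series.argmax() returns an integer position rather than an index label, so it no longer works with .loc here):

In [287]: df.groupby('a', as_index=False).apply(lambda x: x.loc[x.b.idxmax()])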
Out[287]:
   a   b   c  d   e
0  a  25  12  1  20
1  b  12   1  1   1
2  n  25   2  3   3

This returns the complete row holding each group's largest b, whatever the values in the other columns.
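The same result can probably be had without apply, by looking up the index labels first and then selecting rows. The following is only a sketch on the sample frame from the question; the variable names winners and result are made up for illustration.

```python
import pandas as pd

# Sample frame from the question, built by hand instead of read_clipboard()
df = pd.DataFrame({
    "a": ["a", "a", "b", "n"],
    "b": [25, 15, 12, 25],
    "c": [12, 1, 1, 2],
    "d": [1, 1, 1, 3],
    "e": [20, 1, 1, 3],
})

# For each group in column a, idxmax() returns the index label
# of the row with the largest b (the first one if there are ties)
winners = df.groupby("a")["b"].idxmax()

# .loc with those labels selects the complete rows
result = df.loc[winners].reset_index(drop=True)
print(result)
```

Passing a list of keys to groupby, as in the question's groupby([1, 7]), should work the same way. For a column of timestamps stored as strings, converting it with pd.to_datetime first is likely needed before idxmax() can compare the values.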

Colonel Beauvel answered Oct 20 '22 03:10
