A similar question is asked here: Python : Getting the Row which has the max value in groups using groupby
However, I just need one record per group even if there are more than one record with maximum value in that group.
In the example below, I need one record for "s2". For me it doesn't matter which one.
>>> df = DataFrame({'Sp':['a','b','c','d','e','f'], 'Mt':['s1', 's1', 's2','s2','s2','s3'], 'Value':[1,2,3,4,5,6], 'count':[3,2,5,10,10,6]})
>>> df
Mt Sp Value count
0 s1 a 1 3
1 s1 b 2 2
2 s2 c 3 5
3 s2 d 4 10
4 s2 e 5 10
5 s3 f 6 6
>>> idx = df.groupby(['Mt'])['count'].transform(max) == df['count']
>>> df[idx]
Mt Sp Value count
0 s1 a 1 3
3 s2 d 4 10
4 s2 e 5 10
5 s3 f 6 6
>>>
Find Maximum Element in Pandas DataFrame's RowIf the axis equals to 0, the max() method will find the max element of each column. On the other hand, if the axis equals to 1, the max() will find the max element of each row.
You can use first
In [14]: df.groupby('Mt').first()
Out[14]:
Sp Value count
Mt
s1 a 1 3
s2 c 3 5
s3 f 6 6
Set as_index=False
to achieve your goal
In [28]: df.groupby('Mt', as_index=False).first()
Out[28]:
Mt Sp Value count
0 s1 a 1 3
1 s2 c 3 5
2 s3 f 6 6
Sorry for misunderstanding what you mean. You can sort it first if you want the one with max count in a group
In [196]: df.sort('count', ascending=False).groupby('Mt', as_index=False).first()
Out[196]:
Mt Sp Value count
0 s1 a 1 3
1 s2 e 5 10
2 s3 f 6 6
To get first occurence of maximum count
you can use pandas.DataFrame.idxmax() function:
>>> df.iloc[df.groupby(['Mt']).apply(lambda x: x['count'].idxmax())]
Mt Sp Value count
0 s1 a 1 3
3 s2 d 4 10
5 s3 f 6 6
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With