Pandas groupby sequential values

Question

I have no idea how to call this operation, so I couldn't really google anything, but here's what I'm trying to do:

I have this dataframe:

df = pd.DataFrame({"name": ["A", "B", "B", "B", "A", "A", "B"], "value":[3, 1, 2, 0, 5, 2, 3]})
df
  name  value
0    A      3
1    B      1
2    B      2
3    B      0
4    A      5
5    A      2
6    B      3

And I want to group it on df.name and apply a max function on df.values but only if the names are in sequence. So my desired result is as follows:

df.groupby_sequence("name")["value"].agg(max)
  name  value
0    A      3
1    B      2
2    A      5
3    B      3

Any clue how to do this?

sacuL · Accepted Answer

Using pandas, you can groupby when the name changes from row to row, using (df.name!=df.name.shift()).cumsum(), which essentially groups together consecutive names:

>>> df.groupby((df.name!=df.name.shift()).cumsum()).max().reset_index(drop=True)
  name  value
0    A      3
1    B      2
2    A      5
3    B      3

Dani Mesejo · Answer

Not exactly a pandas solution, but you could use groupby from itertools:

from operator import itemgetter

import pandas as pd
from itertools import groupby

df = pd.DataFrame({"name": ["A", "B", "B", "B", "A", "A", "B"], "value":[3, 1, 2, 0, 5, 2, 3]})

result = [max(group, key=itemgetter(1)) for k, group in groupby(zip(df.name, df.value), key=itemgetter(0))]

print(result)

Output

[('A', 3), ('B', 2), ('A', 5), ('B', 3)]

Pandas groupby sequential values

Tags:

python

pandas

pandas-groupby

Jurgy

2 Answers

sacuL

Dani Mesejo

Recent Activity

Donate For Us

Pandas groupby sequential values

Tags:

python

pandas

pandas-groupby

Jurgy

2 Answers

sacuL

Dani Mesejo

Related questions

Recent Activity

Donate For Us