This is my DataFrame:
import pandas as pd
df = pd.DataFrame(
{
'a': [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 10, 22],
'b': [1, 1, 1, -1, -1, -1, -1, 2, 2, 2, 2, -1, -1, -1, -1],
'c': [25, 25, 25, 45, 45, 45, 45, 65, 65, 65, 65, 40, 40, 30, 30]
}
)
The expected output: Grouping df
by c
and a condition:
a b c
0 10 1 25
1 15 1 25
2 20 1 25
3 25 -1 45
4 30 -1 45
5 35 -1 45
6 40 -1 45
11 65 -1 40
12 70 -1 40
The process is as follows:
a) Selecting the group that all of the b
values is 1
. In my data and this df
there is only one group with this condition.
b) Selecting first two groups (from top of df
) that all of their b
values are -1.
For example:
a) Group 25 is selected.
b) There are three groups with this condition. First two groups are: Group 45 and 40.
Note that there is a possibility in my data that there are no groups that has a
or b
condition. If that is the case, returning whatever matches the criteria is fine. For example the output could be only one group or no groups at all.
The groups that I want are shown below:
These are my attempts that got very close:
df1 = df.groupby('c').filter(lambda g: g.b.eq(1).all())
gb = df.groupby('c')
new_gb = pd.concat([gb.get_group(group) for i, group in enumerate(gb.groups) if i < 2])
You can use custom masks for boolean indexing:
# identify groups with all 1
m1 = df['b'].eq(1).groupby(df['c']).transform('all')
# identify groups with all -1
m2 = df['b'].eq(-1).groupby(df['c']).transform('all')
# keep rows of first 2 groups with all -1
m3 = df['c'].isin(df.loc[m2, 'c'].unique()[:2])
# select m1 OR m3
out = df[m1 | m3]
Or, for a variant without groupby
, using set
operations:
# identify rows with 1/-1
m1 = df['b'].eq(1)
m2 = df['b'].eq(-1)
# drop c that have values other that 1/-1: {65}
# drop -1 groups after 2nd occurrence: {30}
drop = set(df.loc[~(m1|m2), 'c']) | set(df.loc[m2, 'c'].unique()[2:])
out = df[~df['c'].isin(drop)]
Output:
a b c
0 10 1 25
1 15 1 25
2 20 1 25
3 25 -1 45
4 30 -1 45
5 35 -1 45
6 40 -1 45
11 65 -1 40
12 70 -1 40
Intermediates (first approach):
a b c m1 m2 m3
0 10 1 25 True False False
1 15 1 25 True False False
2 20 1 25 True False False
3 25 -1 45 False True True
4 30 -1 45 False True True
5 35 -1 45 False True True
6 40 -1 45 False True True
7 45 2 65 False False False
8 50 2 65 False False False
9 55 2 65 False False False
10 60 2 65 False False False
11 65 -1 40 False True True
12 70 -1 40 False True True
13 10 -1 30 False True False
14 22 -1 30 False True False
Intermediates (second approach):
a b c m1 m2 ~isin(drop)
0 10 1 25 True False True
1 15 1 25 True False True
2 20 1 25 True False True
3 25 -1 45 False True True
4 30 -1 45 False True True
5 35 -1 45 False True True
6 40 -1 45 False True True
7 45 2 65 False False False
8 50 2 65 False False False
9 55 2 65 False False False
10 60 2 65 False False False
11 65 -1 40 False True True
12 70 -1 40 False True True
13 10 -1 30 False True False
14 22 -1 30 False True False
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With