Cumulative conditional count

Question

I have the following dataframe.

import pandas as pd

df = pd.DataFrame(
    {
        "drive": [1,1,2,2,2,3,3,3,4,4,4,5,5,6,6,7,7],
        "team": ['home','home','away','away','away','home','home','home','away',
                 'away','away','home','home','away','away','home','home'],
        "home_comfy_lead": [0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,1,1],
        "home_drives": [1,1,0,0,0,2,2,2,0,0,0,3,3,0,0,4,4],
        'home_drives_with_comfy_lead': [0,0,0,0,0,0,0,1,0,0,0,2,2,0,0,3,3]
    })

I am trying to make two columns:

A home_drives column that numbers each distinct home drive in order (1, 2, 3, ...), taking the drives from the drive column where team is 'home', and is 0 on away rows.
A home_drives_with_comfy_lead column that numbers, in the same way, only the home drives on which home_comfy_lead is 1, and is 0 on every other row.

My desired output is:

    drive  team  home_comfy_lead  home_drives  home_drives_with_comfy_lead
0       1  home                0            1                            0
1       1  home                0            1                            0
2       2  away                0            0                            0
3       2  away                0            0                            0
4       2  away                0            0                            0
5       3  home                0            2                            0
6       3  home                0            2                            0
7       3  home                1            2                            1
8       4  away                0            0                            0
9       4  away                0            0                            0
10      4  away                0            0                            0
11      5  home                1            3                            2
12      5  home                1            3                            2
13      6  away                0            0                            0
14      6  away                0            0                            0
15      7  home                1            4                            3
16      7  home                1            4                            3

Can anyone help with this? I've been struggling with this for a few days now.

ALollz · Accepted Answer

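Use .where to mask the rows you don't want to count, then groupby + ngroup to number the groups that remain. ngroup starts numbering at 0, and here we get lucky that the NaN group (the masked rows) gets assigned -1. You want to start counting at 1 and have the masked rows show 0, so adding +1 fixes both of those simultaneously. Note that this relies on ngroup labelling the NaN group -1, which may differ between pandas versions.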

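# Mask away rows to NaN, number the remaining drives, then shift by 1
df['home_drives'] = df.where(df.team == 'home').groupby('drive').ngroup()+1
# Same idea, keeping only rows with a comfy lead and numbering by home_drives
df['hdwcl'] = df.where(df.home_comfy_lead == 1).groupby('home_drives').ngroup()+1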

Output:

    drive  team  home_comfy_lead  home_drives  hdwcl
0       1  home                0            1      0
1       1  home                0            1      0
2       2  away                0            0      0
3       2  away                0            0      0
4       2  away                0            0      0
5       3  home                0            2      0
6       3  home                0            2      0
7       3  home                1            2      1
8       4  away                0            0      0
9       4  away                0            0      0
10      4  away                0            0      0
11      5  home                1            3      2
12      5  home                1            3      2
13      6  away                0            0      0
14      6  away                0            0      0
15      7  home                1            4      3
16      7  home                1            4      3
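
As a sketch of a variant that does not depend on how ngroup labels the masked group: pd.factorize numbers values in order of first appearance and, by default, gives missing values the code -1, so the same +1 shift applies. The names is_home and has_lead are illustrative and not part of the answer above; hdwcl is the answer's shorthand for home_drives_with_comfy_lead.

```python
import pandas as pd

# Only the three input columns from the question; the two counts are computed below.
df = pd.DataFrame(
    {
        "drive": [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 7],
        "team": ["home", "home", "away", "away", "away", "home", "home", "home",
                 "away", "away", "away", "home", "home", "away", "away",
                 "home", "home"],
        "home_comfy_lead": [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
    }
)

# Illustrative helper masks (hypothetical names, not from the answer).
is_home = df["team"].eq("home")
has_lead = is_home & df["home_comfy_lead"].eq(1)

# Masked-out rows become NaN; factorize codes NaN as -1 and the
# remaining drives as 0, 1, 2, ... in order of first appearance.
home_codes, _ = pd.factorize(df["drive"].where(is_home))
df["home_drives"] = home_codes + 1

lead_codes, _ = pd.factorize(df["drive"].where(has_lead))
df["hdwcl"] = lead_codes + 1

print(df)
```

ngroup numbers groups in sorted key order, whereas factorize numbers them by first appearance; the two agree on this data because the drive numbers only ever increase.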

Tags: python, pandas, numpy, pandas-groupby, data-science

bbk611
