Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

pandas - how to create multiple columns in groupby with conditional?

I need to group a dataframe, but I need to create two columns, one that is a simple count and another that is a count with conditional, as in the example:

enter image description here

The qtd_ok column counts only those that have 'OK'

enter image description here

I tried this, but I do not know how to add the total count in the same groupby:

df.groupby(['column1', 'column2', 'column3']).apply(lambda x : x['status'].sum() == 'OK')
like image 646
Hiago Bonamelli Avatar asked Sep 14 '25 11:09

Hiago Bonamelli


1 Answers

First create helper column A with assign and then aggregate by agg functions sum for count only OK values and size for count all values per groups:

df = (df.assign(A=(df['status']== 'OK'))
        .groupby(['column1', 'column2', 'column3'])['A']
        .agg([('qtd_ok','sum'),('qtd','size')])
        .astype(int)
        .reset_index())

Sample:

df = pd.DataFrame({
        'column1':['a'] * 9,
        'column2':['a'] * 4 + ['b'] * 5,
        'column3':list('aaabaabbb'),
        'status':list('aabaaabba'),
})

print (df)
  column1 column2 column3 status
0       a       a       a      a
1       a       a       a      a
2       a       a       a      b
3       a       a       b      a
4       a       b       a      a
5       a       b       a      a
6       a       b       b      b
7       a       b       b      b
8       a       b       b      a

df = (df.assign(A=(df['status']== 'a'))
        .groupby(['column1', 'column2', 'column3'])['A']
        .agg([('qtd_ok','sum'),('qtd','size')])
        .astype(int)
        .reset_index())
print (df)
  column1 column2 column3  qtd_ok  qtd
0       a       a       a       2    3
1       a       a       b       1    1
2       a       b       a       2    2
3       a       b       b       1    3
like image 129
jezrael Avatar answered Sep 16 '25 00:09

jezrael