Groupby two columns, sum, count and display output values in separate column (pandas)

I have a dataset, df, which I would like to group by two columns (id and date), taking the sum and count of a third column (pwr) while still listing the strings from the type column in a separate column.

Data

id  date    pwr type
aa  q321    10  hey
aa  q321    1   hello
aa  q425    20  hi
aa  q425    20  no
bb  q122    2   ok
bb  q122    1   cool
bb  q422    5   sure
bb  q422    5   sure
bb  q422    5   ok

Desired

id  date    pwr count   type
aa  q321    11  2       hey
                        hello
aa  q425    40  2       hi
                        no
bb  q122    3   2       ok
                        cool
bb  q422    15  3       sure
                        sure
                        ok

Doing

g = df.groupby(['id', 'date'])['pwr'].sum().reset_index()
g['count'] = g['id'].map(df['id'].value_counts())

This works OK, except that I am not sure how to display the string values of the 'type' column. Any suggestions are appreciated.
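One thing worth checking in the attempt above: `df['id'].value_counts()` counts rows per id only, not per (id, date) pair, so the mapped count may not match the desired table. The sketch below rebuilds the sample data from the question and compares the two counts; the names `per_id` and `per_pair` are illustrative and not part of the original post.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb", "bb"],
    "date": ["q321", "q321", "q425", "q425", "q122", "q122", "q422", "q422", "q422"],
    "pwr":  [10, 1, 20, 20, 2, 1, 5, 5, 5],
    "type": ["hey", "hello", "hi", "no", "ok", "cool", "sure", "sure", "ok"],
})

# The question's approach: sum per (id, date), then map a count keyed on id alone.
g = df.groupby(["id", "date"])["pwr"].sum().reset_index()
per_id = g["id"].map(df["id"].value_counts())

# Counting rows per (id, date) pair instead.
per_pair = df.groupby(["id", "date"]).size().reset_index(name="count")["count"]

print(per_id.tolist())    # [4, 4, 5, 5]
print(per_pair.tolist())  # [2, 2, 2, 3]
```

Only the per-pair counts (2, 2, 2, 3) agree with the count column in the desired output.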

Asked by Lynn, Jul 22 '21 15:07

1 Answer

You can use GroupBy.transform() to overwrite pwr with the per-group sum and to add a count column holding the number of rows in each group. Then call .set_index() on the four columns other than type to get a layout similar to the desired output:

df['pwr'] = df.groupby(['id', 'date'])['pwr'].transform('sum')
df['count'] = df.groupby(['id', 'date'])['pwr'].transform('count')

df.set_index(['id', 'date', 'pwr', 'count'])

Output:

                    type
id date pwr count       
aa q321 11  2        hey
            2      hello
   q425 40  2         hi
            2         no
bb q122 3   2         ok
            2       cool
   q422 15  3       sure
            3       sure
            3         ok
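
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same transform-then-set_index idea. It rebuilds the sample data from the question and, as a variation on the snippet above, uses `assign` so that the original df is left unmodified. The names `grouped` and `result` are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb", "bb"],
    "date": ["q321", "q321", "q425", "q425", "q122", "q122", "q422", "q422", "q422"],
    "pwr":  [10, 1, 20, 20, 2, 1, 5, 5, 5],
    "type": ["hey", "hello", "hi", "no", "ok", "cool", "sure", "sure", "ok"],
})

# Group once on the original pwr values and reuse the grouped object.
grouped = df.groupby(["id", "date"])["pwr"]

# transform() broadcasts each group's aggregate back to every row of that
# group, so the type strings stay one per row.
result = (
    df.assign(pwr=grouped.transform("sum"), count=grouped.transform("count"))
      .set_index(["id", "date", "pwr", "count"])
)

print(result)
```

Because `assign` returns a new frame, `df['pwr']` still holds the original per-row values afterwards, which can matter if the raw data is needed later.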

Answered by SeaBean, Oct 12 '22 21:10