I have a dataset, df, that I would like to group by two columns, taking the sum and count of one column and listing the strings from a separate column.
Data
id date pwr type
aa q321 10 hey
aa q321 1 hello
aa q425 20 hi
aa q425 20 no
bb q122 2 ok
bb q122 1 cool
bb q422 5 sure
bb q422 5 sure
bb q422 5 ok
Desired
id  date  pwr  count  type
aa  q321  11   2      hey
                      hello
aa  q425  40   2      hi
                      no
bb  q122  3    2      ok
                      cool
bb  q422  15   3      sure
                      sure
                      ok
Doing
g = df.groupby(['id', 'date'])['pwr'].sum().reset_index()
g['count'] = g['id'].map(df['id'].value_counts())
This works OK, except I am not sure how to display the string output of column 'type'. Any suggestion is appreciated.
You can use GroupBy.transform() to set the values for columns pwr and count. Then call .set_index() on the 4 columns except type to get a layout similar to the desired output:
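# Replace each pwr with its (id, date) group total, then count the rows in each group
df['pwr'] = df.groupby(['id', 'date'])['pwr'].transform('sum')
df['count'] = df.groupby(['id', 'date'])['pwr'].transform('count')
# Move every column except type into the index so repeated labels collapse on display
df.set_index(['id', 'date', 'pwr', 'count'])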
Output:
                     type
id date pwr count
aa q321 11  2         hey
            2       hello
   q425 40  2          hi
            2          no
bb q122 3   2          ok
            2        cool
   q422 15  3        sure
            3        sure
            3          ok
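If one row per group is acceptable, with the type strings collected into a list, a possible alternative is named aggregation. This is only a sketch, assuming pandas 0.25 or later; the sample frame is rebuilt from the question's data and the name out is illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id":   ["aa", "aa", "aa", "aa", "bb", "bb", "bb", "bb", "bb"],
    "date": ["q321", "q321", "q425", "q425", "q122", "q122", "q422", "q422", "q422"],
    "pwr":  [10, 1, 20, 20, 2, 1, 5, 5, 5],
    "type": ["hey", "hello", "hi", "no", "ok", "cool", "sure", "sure", "ok"],
})

# One row per (id, date): total pwr, number of rows, and the type strings as a list
out = (
    df.groupby(["id", "date"], as_index=False)
      .agg(pwr=("pwr", "sum"), count=("pwr", "count"), type=("type", list))
)
print(out)
```

Unlike the transform() approach, this does not modify df and collapses each group to a single row, so it suits further processing better than display.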