Let's say I have a table that looks like this:
Company Region Date Count Amount
AAA XXY 3-4-2018 766 8000
AAA XXY 3-14-2018 766 8600
AAA XXY 3-24-2018 766 2030
BBB XYY 2-4-2018 66 3400
BBB XYY 3-18-2018 66 8370
BBB XYY 4-6-2018 66 1380
I want to drop the Date column, then group by both Company and Region to get the mean of Count and the sum of Amount.
Expected output:
Company Region Count Amount
AAA XXY 766 18630
BBB XYY 66 13150
I looked into the post "Rename result columns from Pandas aggregation ("FutureWarning: using a dict with renaming is deprecated")" and many other posts online, but they all seem to perform only one kind of aggregation. For example, I can group by multiple columns, but I can only produce a single output column as a sum OR a count, not a sum AND a count.
Can someone help?
What I did:
I followed this post here:
https://www.shanelynn.ie/summarising-aggregation-and-grouping-data-in-python-pandas/
However, when I try the method presented toward the end of that article, using a nested dictionary:
aggregation = {
    'Count': {
        'Total Count': 'mean'
    },
    'Amount': {
        'Total Amount': 'sum'
    }
}
I would get this warning:
FutureWarning: using a dict with renaming is deprecated and will be removed in a future version
return super(DataFrameGroupBy, self).aggregate(arg, *args, **kwargs)
I know it works now, but I want to make sure my script keeps working later too. How can I update my code so it stays compatible with future versions?
You need to aggregate with a single, non-nested dictionary and then rename the columns:
aggregation = {'Count': 'mean', 'Amount': 'sum'}
cols_d = {'Count': 'Total Count', 'Amount': 'Total Amount'}
df = df.groupby(['Company','Region'], as_index=False).agg(aggregation).rename(columns=cols_d)
print (df)
Company Region Total Count Total Amount
0 AAA XXY 766 18630
1 BBB XYY 66 13150
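For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the approach above. The DataFrame is rebuilt by hand from the sample table in the question, and the name `result` is only illustrative. Date is dropped simply because it is not listed in the aggregation dictionary.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Company": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Region": ["XXY", "XXY", "XXY", "XYY", "XYY", "XYY"],
    "Date": ["3-4-2018", "3-14-2018", "3-24-2018",
             "2-4-2018", "3-18-2018", "4-6-2018"],
    "Count": [766, 766, 766, 66, 66, 66],
    "Amount": [8000, 8600, 2030, 3400, 8370, 1380],
})

# Flat dictionary: one aggregation function per column.
aggregation = {"Count": "mean", "Amount": "sum"}
# Separate mapping for the new column names.
cols_d = {"Count": "Total Count", "Amount": "Total Amount"}

# "result" is an illustrative name; Date is not in the dict, so it is left out.
result = (
    df.groupby(["Company", "Region"], as_index=False)
      .agg(aggregation)
      .rename(columns=cols_d)
)
print(result)
```

Depending on the pandas version, the mean column may be displayed as a float (766.0) rather than an integer.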
Another solution uses add_prefix instead of rename:
aggregation = {'Count': 'mean', 'Amount': 'sum'}
df = df.groupby(['Company','Region']).agg(aggregation).add_prefix('Total ').reset_index()
print (df)
Company Region Total Count Total Amount
0 AAA XXY 766 18630
1 BBB XYY 66 13150
df.groupby(['Region', 'Company']).agg({'Count': 'mean', 'Amount': 'sum'}).reset_index()
outputs:
Region Company Count Amount
0 XXY AAA 766 18630
1 XYY BBB 66 13150
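If you are on pandas 0.25 or later, named aggregation may be another option, since it lets you aggregate and name the output columns in a single agg call. This is a sketch under that version assumption, not something taken from the answers above. Because the target names contain spaces, they are passed by unpacking a dictionary; `named` and `result` are illustrative names.

```python
import pandas as pd

# Sample data copied from the table in the question (Date omitted for brevity).
df = pd.DataFrame({
    "Company": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Region": ["XXY", "XXY", "XXY", "XYY", "XYY", "XYY"],
    "Count": [766, 766, 766, 66, 66, 66],
    "Amount": [8000, 8600, 2030, 3400, 8370, 1380],
})

# Named aggregation: output name -> (input column, aggregation function).
# Assumes pandas >= 0.25; "named" is an illustrative variable name.
named = {
    "Total Count": ("Count", "mean"),
    "Total Amount": ("Amount", "sum"),
}

# Unpack the dict because the output names are not valid Python identifiers.
result = df.groupby(["Company", "Region"]).agg(**named).reset_index()
print(result)
```

This keeps the column-to-function mapping and the renaming in one place, at the cost of requiring a newer pandas than the dictionary-plus-rename approach.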