I am writing a function that groups by ID and sums the $ value associated with each ID, using this Python code:
df = df.groupby([' Id'], as_index=False, sort=False)[["Amount"]].sum();
but it doesn't rename the column. So I tried this:
df = df.groupby([' Id'], as_index=False, sort=False)[["Amount"]].sum().reset_index(name ='Total Amount')
but it gave me the error: TypeError: reset_index() got an unexpected keyword argument 'name'
So finally I tried this, following the post "Python Pandas Create New Column with Groupby().Sum()":
df = df.groupby(['Id'])[["Amount"]].transform('sum');
but it still didn't work.
What am I doing wrong?
One way of renaming the columns in a pandas DataFrame is by using the rename() function.
To create a new column from the output of groupby().sum(), first apply the groupby().sum() operation and then store the result in a new column.
Use DataFrame.groupby().sum() to group rows by one or more columns and compute the sum for each group. groupby() returns a DataFrameGroupBy object, which provides the aggregate function sum() to calculate the sum of a given column for each group.
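As a minimal sketch of the "aggregate, then rename()" approach described above: the small frame below is made-up illustration data, and the variable name summed is only a placeholder.

```python
import pandas as pd

# Made-up sample data for illustration only
df = pd.DataFrame({'Id': [1, 2, 2], 'Amount': [10, 30, 50]})

# Sum Amount per Id, keeping Id as a regular column (as_index=False),
# then rename the aggregated column afterwards with rename()
summed = (
    df.groupby('Id', as_index=False, sort=False)[['Amount']].sum()
      .rename(columns={'Amount': 'Total Amount'})
)
print(summed)
```

Renaming after the aggregation leaves the original df untouched, which can be convenient if the unsummed Amount column is still needed elsewhere.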
I think you need to remove the parameter as_index=False and use Series.reset_index, because with this parameter the result is a DataFrame, and DataFrame.reset_index with the parameter name fails:
df = df.groupby('Id', sort=False)["Amount"].sum().reset_index(name ='Total Amount')
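To make the difference concrete, here is a small sketch (with made-up data and placeholder variable names) showing that the name keyword is accepted when the grouped sum is a Series, while the same call on a DataFrame raises the TypeError from the question.

```python
import pandas as pd

# Made-up sample data for illustration only
df = pd.DataFrame({'Id': [1, 2, 2], 'Amount': [10, 30, 50]})

# A single label ("Amount") gives a Series after sum();
# Series.reset_index accepts name= to label the values column
totals_series = df.groupby('Id', sort=False)['Amount'].sum()
out = totals_series.reset_index(name='Total Amount')
print(out)

# A list of labels (["Amount"]) gives a DataFrame after sum();
# DataFrame.reset_index has no name= parameter, so this raises TypeError
totals_frame = df.groupby('Id', sort=False)[['Amount']].sum()
try:
    totals_frame.reset_index(name='Total Amount')
    raised = False
except TypeError as exc:
    raised = True
    print('TypeError:', exc)
```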
Or rename the column first:
d = {'Amount':'Total Amount'}
df = df.rename(columns=d).groupby('Id', sort=False, as_index=False)["Total Amount"].sum()
Sample:
df = pd.DataFrame({'Id':[1,2,2],'Amount':[10, 30,50]})
print (df)
Amount Id
0 10 1
1 30 2
2 50 2
df1 = df.groupby('Id', sort=False)["Amount"].sum().reset_index(name ='Total Amount')
print (df1)
Id Total Amount
0 1 10
1 2 80
d = {'Amount':'Total Amount'}
df1 = df.rename(columns=d).groupby('Id', sort=False, as_index=False)["Total Amount"].sum()
print (df1)
Id Total Amount
0 1 10
1 2 80
But if you need a new column with the sums in the original df, use transform and assign the output to a new column:
df['Total Amount'] = df.groupby('Id', sort=False)["Amount"].transform('sum')
print (df)
Amount Id Total Amount
0 10 1 10
1 30 2 80
2 50 2 80
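This is probably also why the transform attempt in the question did not give the expected result. The sketch below (made-up data, placeholder name only_sums) contrasts assigning the transform output to the whole frame with assigning it to a new column.

```python
import pandas as pd

# Made-up sample data for illustration only
df = pd.DataFrame({'Id': [1, 2, 2], 'Amount': [10, 30, 50]})

# transform returns one value per original row, with only the selected
# column(s); writing this back to df would drop Id and the original amounts
only_sums = df.groupby('Id')[['Amount']].transform('sum')
print(only_sums)

# Assigning to a new column keeps the existing columns next to the group sums
df['Total Amount'] = df.groupby('Id')['Amount'].transform('sum')
print(df)
```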
import pandas as pd
# set up dataframe
df = pd.DataFrame({'colA':['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd'],
'colB':['cat', 'cat', 'dog', 'cat', 'dog', 'cat', 'cat', 'dog'],
'colC':[1,2,3,4,4,5,6,7], })
print(df)
colA colB colC
0 a cat 1
1 a cat 2
2 a dog 3
3 b cat 4
4 b dog 4
5 c cat 5
6 c cat 6
7 d dog 7
# group on vals in column A
# get min (within groups) for column B
# get avg (within groups) for column C
df_agg = ( df.groupby(by=['colA'])
.agg({'colB':'min', 'colC':'mean'})
.rename(columns={'colB':'colB_grp_min', 'colC':'colC_grp_avg'})
)
print(df_agg)
colB_grp_min colC_grp_avg
colA
a cat 2.0
b cat 4.0
c cat 5.5
d dog 7.0
# if you want multiple aggregations on the same column, pass a list
# this will return a multiindex
# group on vals in column A
# get min (within groups) for column B
# get avg and max (within groups) for column C
df_agg2 = ( df.groupby(by=['colA'])
.agg({'colB':'min', 'colC':['mean', 'max']})
.rename(columns={'colB':'colB_grp_min', 'colC':'colC_grp_multi_index'})
)
print(df_agg2)
colB_grp_min colC_grp_multi_index
min mean max
colA
a cat 2.0 3
b cat 4.0 4
c cat 5.5 6
d dog 7.0 7
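If the MultiIndex columns are not wanted, one alternative is named aggregation, which names each output column directly. This is a sketch assuming pandas 0.25 or later; the output column names (including colC_grp_max) and the variable df_named are chosen here for illustration.

```python
import pandas as pd

# Same example frame as above
df = pd.DataFrame({'colA': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd'],
                   'colB': ['cat', 'cat', 'dog', 'cat', 'dog', 'cat', 'cat', 'dog'],
                   'colC': [1, 2, 3, 4, 4, 5, 6, 7]})

# Named aggregation: output_name=(input_column, aggregation)
# Each keyword becomes a flat column name, so no MultiIndex is produced
df_named = df.groupby('colA').agg(
    colB_grp_min=('colB', 'min'),
    colC_grp_avg=('colC', 'mean'),
    colC_grp_max=('colC', 'max'),
)
print(df_named)
```

This does the aggregation and the renaming in one step, so no separate rename() call is needed.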