I found Pandas groupby cumulative sum and it has been very useful. However, I would like to determine how to calculate a reverse cumulative sum.
The link suggests the following.
df.groupby(by=['name','day']).sum().groupby(level=[0]).cumsum()
To get the reverse cumulative sum, I tried slicing the data, but it fails:
df.groupby(by=['name','day']).ix[::-1, 'no'].sum().groupby(level=[0]).cumsum()
The desired output is:

name | day | no | reverse cumsum
Jack | Monday | 10 | 90
Jack | Tuesday | 30 | 80
Jack | Wednesday | 50 | 50
Jill | Monday | 40 | 80
Jill | Wednesday | 40 | 40
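To make the expected result concrete, here is the frame I am hoping to end up with, built by hand (reverse_cumsum is just an illustrative name for the new column):

import pandas as pd

expected = pd.DataFrame(
    {'name': ['Jack', 'Jack', 'Jack', 'Jill', 'Jill'],
     'day': ['Monday', 'Tuesday', 'Wednesday', 'Monday', 'Wednesday'],
     'no': [10, 30, 50, 40, 40],
     'reverse_cumsum': [90, 80, 50, 80, 40]})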
EDIT: Based on feedback, I tried to apply the suggested code to a larger dataframe:
import pandas as pd

df = pd.DataFrame(
    {'name': ['Jack', 'Jack', 'Jack', 'Jill', 'Jill'],
     'surname': ['Jones', 'Jones', 'Jones', 'Smith', 'Smith'],
     'car': ['VW', 'Mazda', 'VW', 'Merc', 'Merc'],
     'country': ['UK', 'US', 'UK', 'EU', 'EU'],
     'year': [1980, 1980, 1980, 1980, 1980],
     'day': ['Monday', 'Tuesday', 'Wednesday', 'Monday', 'Wednesday'],
     'date': ['2016-02-31', '2016-01-31', '2016-01-31', '2016-01-31', '2016-01-31'],
     'no': [10, 30, 50, 40, 40],
     'qty': [100, 500, 200, 433, 222]})
I then try to group on a number of columns, but it does not apply the grouping the way I expect.
df = df.groupby(by=['name','surname','car','country','year','day','date']).sum().iloc[::-1].groupby(level=[0]).cumsum().iloc[::-1].reset_index()
Why is this the case? I expect Jack Jones with the Mazda to be a separate cumulative quantity from Jack Jones with the VW.
Groupby preserves the order of rows within each group, so you can use a double iloc - reverse the rows, take the cumulative sum per group, then reverse back:
df = df.groupby(by=['name','day']).sum().iloc[::-1].groupby(level=[0]).cumsum().iloc[::-1]
print (df)
no
name day
Jack Monday 90
Tuesday 80
Wednesday 50
Jill Monday 80
Wednesday 40
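As a side check, the same numbers follow from the identity reverse cumsum = group total - forward cumsum + current value; a quick sketch, assuming df is still the original frame with name, day and no columns:

g = df.groupby(by=['name','day']).sum()
# per-group total minus the forward running total, plus the current value
check = g.groupby(level=0).transform('sum') - g.groupby(level=0).cumsum() + g
print (check)

For Jack this gives 90 - 10 + 10 = 90, 90 - 40 + 30 = 80 and 90 - 90 + 50 = 50, matching the output above.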
For a new column, the solution is simpler:
df = df.groupby(by=['name','day']).sum()
df['new'] = df.iloc[::-1].groupby(level=[0]).cumsum()
print (df)
no new
name day
Jack Monday 10 90
Tuesday 30 80
Wednesday 50 50
Jill Monday 40 80
Wednesday 40 40
EDIT:

The problem is in the second groupby - you need to pass more levels, level=[0,1,2], which means grouping by the first (name), second (surname) and third (car) index levels.
df1 = (df.groupby(by=['name','surname','car','country','year','day','date'])
.sum())
print (df1)
no qty
name surname car country year day date
Jack Jones Mazda US 1980 Tuesday 2016-01-31 30 500
VW UK 1980 Monday 2016-02-31 10 100
Wednesday 2016-01-31 50 200
Jill Smith Merc EU 1980 Monday 2016-01-31 40 433
Wednesday 2016-01-31 40 222
df2 = (df.groupby(by=['name','surname','car','country','year','day','date'])
.sum()
.iloc[::-1]
.groupby(level=[0,1,2])
.cumsum()
.iloc[::-1]
.reset_index())
print (df2)
name surname car country year day date no qty
0 Jack Jones Mazda US 1980 Tuesday 2016-01-31 30 500
1 Jack Jones VW UK 1980 Monday 2016-02-31 60 300
2 Jack Jones VW UK 1980 Wednesday 2016-01-31 50 200
3 Jill Smith Merc EU 1980 Monday 2016-01-31 80 655
4 Jill Smith Merc EU 1980 Wednesday 2016-01-31 40 222
Or it is possible to select the levels by name - see the groupby enhancements in pandas 0.20.1+:
df2 = (df.groupby(by=['name','surname','car','country','year','day','date'])
.sum()
.iloc[::-1]
.groupby(['name','surname','car'])
.cumsum()
.iloc[::-1]
.reset_index())
print (df2)
name surname car country year day date no qty
0 Jack Jones Mazda US 1980 Tuesday 2016-01-31 30 500
1 Jack Jones VW UK 1980 Monday 2016-02-31 60 300
2 Jack Jones VW UK 1980 Wednesday 2016-01-31 50 200
3 Jill Smith Merc EU 1980 Monday 2016-01-31 80 655
4 Jill Smith Merc EU 1980 Wednesday 2016-01-31 40 222
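If every key combination occurs only once, as in this sample data, one possible shortcut (just a sketch, with no_rev and qty_rev as illustrative column names) is to skip the aggregation and add the reversed cumulative sums as new columns on the original frame:

rev = (df.iloc[::-1]
         .groupby(['name','surname','car'])[['no','qty']]
         .cumsum())
# assignment aligns on the index, so the reversed row order does not matter
df['no_rev'] = rev['no']
df['qty_rev'] = rev['qty']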