Pandas dataframe - running sum with reset

I want to calculate the running sum of a given column (without using loops, of course). The caveat is that another column specifies when to reset the running sum to the value present in that row. This is best explained by the following example:

       reset  val  desired_col
    0      0    1            1
    1      0    5            6
    2      0    4           10
    3      1    2            2
    4      1   -1           -1
    5      0    6            5
    6      0    4            9
    7      1    2            2

desired_col is the column I want to compute.

922

asked Oct 01 '15 14:10

Baron Yugovich

1 Answer

You can use cumsum() twice: once on reset to label each group of rows, and once on val within each group:

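    import pandas as pd

    df = pd.DataFrame({'reset': [0, 0, 0, 1, 1, 0, 0, 1],
                       'val': [1, 5, 4, 2, -1, 6, 4, 2],
                       'desired_col': [1, 6, 10, 2, -1, 5, 9, 2]})
    print(df)
    #   reset  val  desired_col
    #0      0    1            1
    #1      0    5            6
    #2      0    4           10
    #3      1    2            2
    #4      1   -1           -1
    #5      0    6            5
    #6      0    4            9
    #7      1    2            2

    # the cumulative sum of reset increases at every reset row,
    # so it labels each group of rows
    df['cumsum'] = df['reset'].cumsum()
    # cumulative sum of val within each group, stored in column des
    df['des'] = df.groupby(['cumsum'])['val'].cumsum()
    print(df)
    #   reset  val  desired_col  cumsum  des
    #0      0    1            1       0    1
    #1      0    5            6       0    6
    #2      0    4           10       0   10
    #3      1    2            2       1    2
    #4      1   -1           -1       2   -1
    #5      0    6            5       2    5
    #6      0    4            9       2    9
    #7      1    2            2       3    2

    # remove the columns desired_col and cumsum
    df = df.drop(['desired_col', 'cumsum'], axis=1)
    print(df)
    #   reset  val  des
    #0      0    1    1
    #1      0    5    6
    #2      0    4   10
    #3      1    2    2
    #4      1   -1   -1
    #5      0    6    5
    #6      0    4    9
    #7      1    2    2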
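The same idea can be wrapped in a small helper that skips the temporary cumsum column. This is only a sketch: the function name running_sum_with_reset and its parameter names are made up for illustration, and it assumes the reset column holds 0/1 flags as in the question.

```python
import pandas as pd


def running_sum_with_reset(df, value_col="val", reset_col="reset"):
    """Cumulative sum of value_col that restarts on rows where reset_col is 1.

    Hypothetical helper, not part of pandas.
    """
    # The cumulative sum of the flags increases at every reset row,
    # so it gives each row the id of the segment it belongs to.
    group_ids = df[reset_col].cumsum()
    # Grouping by the Series directly avoids adding a helper column.
    return df[value_col].groupby(group_ids).cumsum()


df = pd.DataFrame({
    "reset": [0, 0, 0, 1, 1, 0, 0, 1],
    "val": [1, 5, 4, 2, -1, 6, 4, 2],
})
df["des"] = running_sum_with_reset(df)
print(df)
```

Because the group labels are passed to groupby as a Series, nothing has to be dropped afterwards; the result is aligned to the original index.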

137

answered Sep 21 '22 06:09

jezrael

Tags: python, pandas, dataframe, multiple-columns, cumsum
