Calculate cumulative sum from last non-zero entry in python

I have a numeric series like [0,0,0,0,1,1,1,0,0,1,1,0]. I would like to calculate the cumulative sum since the last zero entry, i.e. the cumsum should reset to zero whenever a zero entry occurs.

input: [0,0,0,0,1,1,1,0,0,1,1,0]
output:[0,0,0,0,1,2,3,0,0,1,2,0] 

Is there a built-in Python function that can do this, or a better way to calculate it without a loop?

AAA asked Jan 01 '23 21:01


2 Answers

You can do it with itertools.accumulate. Besides the iterable, it accepts an optional second argument: a two-argument function whose first argument is the accumulated result so far and whose second argument is the current element of the iterable. A fairly simple lambda that keeps adding to the running total unless the current element is zero does the job.

from itertools import accumulate

nums = [0,0,0,0,1,1,1,0,0,1,1,0]

result = accumulate(nums, lambda acc, elem: acc + elem if elem else 0)
print(list(result))
# [0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0]
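If the lambda is hard to read, the same logic can be spelled out as a generator. This is only a sketch of what accumulate does with that function here; the name running_sum_with_reset is made up for illustration and is not part of itertools.

```python
from itertools import accumulate

def running_sum_with_reset(values):
    """Yield the running total of values, restarting at 0 on every zero.

    Hypothetical helper, written only to mirror the lambda above.
    """
    total = 0
    for value in values:
        # A zero resets the total; any other value is added to it.
        total = total + value if value else 0
        yield total

nums = [0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0]

loop_result = list(running_sum_with_reset(nums))
accumulate_result = list(
    accumulate(nums, lambda acc, elem: acc + elem if elem else 0)
)

print(loop_result)
# [0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0]
print(loop_result == accumulate_result)
# True
```

Note that accumulate passes the first element through unchanged and only calls the function from the second element on, which matches the generator because its total starts at 0.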
benvc answered Jan 03 '23 11:01


We can do this in numpy with two passes of np.cumsum(..). First we calculate the cumsum of the array:

import numpy as np
a = np.array([0,0,0,0,1,1,1,0,0,1,1,0])
c = np.cumsum(a)

This gives us:

>>> c
array([0, 0, 0, 0, 1, 2, 3, 3, 3, 4, 5, 5])

Next we take the elements of c at the positions where a is 0, and compute elementwise the difference between each of those elements and its predecessor (a 0 is prepended so that the first one has a predecessor too):

corr = np.diff(np.hstack(((0,), c[a == 0])))

This is the correction we need to apply at those positions:

>>> corr
array([0, 0, 0, 0, 3, 0, 2])
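To see where these numbers come from, it may help to split the one-liner into its intermediate arrays. The names at_zeros and padded are introduced here only for illustration; they do not appear in the answer's code.

```python
import numpy as np

a = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0])
c = np.cumsum(a)

# Running total as seen at each position where a is 0.
at_zeros = c[a == 0]
# Prepend a 0 so that the first zero position has a predecessor to diff against.
padded = np.hstack(((0,), at_zeros))
# How much the running total grew since the previous zero position.
corr = np.diff(padded)

print(at_zeros.tolist())  # [0, 0, 0, 0, 3, 3, 5]
print(padded.tolist())    # [0, 0, 0, 0, 0, 3, 3, 5]
print(corr.tolist())      # [0, 0, 0, 0, 3, 0, 2]
```

Each entry of corr is exactly the amount that has piled up since the previous zero, which is what has to be subtracted at that zero to bring the running total back to 0.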

We can then make a copy of a (or do this in place), and subtract the correction:

a2 = a.copy()
a2[a == 0] -= corr

this gives us:

>>> a2
array([ 0,  0,  0,  0,  1,  1,  1, -3,  0,  1,  1, -2])

and now we can calculate the cumulative sum of a2, which resets to 0 at every 0, since the correction cancels out the increments accumulated since the previous 0:

>>> a2.cumsum()
array([0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0])

or as a function:

import numpy as np

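def cumsumreset(iterable, reset=0):
    a = np.array(iterable)
    c = a.cumsum()
    a2 = a.copy()
    # positions at which the running total has to restart
    # (named mask to avoid shadowing the built-in filter)
    mask = a == reset
    # subtract what has accumulated since the previous reset position
    a2[mask] -= np.diff(np.hstack(((0,), c[mask])))
    return a2.cumsum()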

this then gives us:

>>> cumsumreset([0,0,0,0,1,1,1,0,0,1,1,0])
array([0, 0, 0, 0, 1, 2, 3, 0, 0, 1, 2, 0])
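As a rough sanity check, the NumPy version can be compared against the itertools.accumulate approach on random integer input. This is a sketch under stated assumptions: cumsumreset is repeated (with the boolean index held in a local called mask) so the block runs on its own, while cumsumreset_reference, the seed, and the sample sizes are arbitrary choices made for illustration.

```python
from itertools import accumulate

import numpy as np

def cumsumreset(iterable, reset=0):
    a = np.array(iterable)
    c = a.cumsum()
    a2 = a.copy()
    mask = a == reset
    a2[mask] -= np.diff(np.hstack(((0,), c[mask])))
    return a2.cumsum()

def cumsumreset_reference(iterable):
    # Hypothetical reference implementation: running total that restarts on 0.
    return list(accumulate(iterable, lambda acc, elem: acc + elem if elem else 0))

# 100 random integer lists of length 20 with values in {0, 1, 2, 3}.
rng = np.random.default_rng(0)
samples = [rng.integers(0, 4, size=20).tolist() for _ in range(100)]

all_match = all(
    cumsumreset(sample).tolist() == cumsumreset_reference(sample)
    for sample in samples
)
print(all_match)
# True
```

The comparison is exact for integers. With floating-point input the NumPy version relies on cancellation, so the value at a reset position may come out as a tiny rounding error rather than exactly 0.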
Willem Van Onsem answered Jan 03 '23 09:01