
Numpy custom Cumsum function with upper/lower limits?

I have a numpy array of values:

import numpy as np

a = np.random.randint(-100, 100, 10000)
b = a/100

I want to apply a custom cumsum function, but I haven't found a way to do it without loops. The custom function caps the cumulative sum at an upper limit of 1 and a lower limit of -1: if the sum is already at a limit and the next value would push it further past, that value contributes 0.

In the case that the sum is between the limits of -1 and 1 but adding the next value would cross one of them, only the remainder up to -1 or 1 is added.

Here is the loop version:

def cumsum_with_limits(values):
    cumsum_values = []
    sum = 0
    for i in values:
        if sum+i <= 1 and sum+i >= -1: 
            sum += i
            cumsum_values.append(sum)
        elif sum+i >= 1:
            d = 1-sum # Remainder to 1
            sum += d
            cumsum_values.append(sum)
        elif sum+i <= -1:
            d = -1-sum # Remainder to -1
            sum += d
            cumsum_values.append(sum)

    return cumsum_values

Is there any way to vectorize this? I need to run this function on large datasets and performance is my current issue. Appreciate any help!


Update: Fixed the code a bit, and a little clarification for the outputs: Using np.random.seed(0), the first 6 values are:

b = [0.72, -0.53, 0.17, 0.92, -0.33, 0.95]

Expected output:

o = [0.72, 0.19, 0.36, 1, 0.67, 1]
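For reference, the rule can be checked with a minimal plain-Python sketch (the helper name `clamped_cumsum` is mine, not from the question) that reproduces the expected output on these six values:

```python
def clamped_cumsum(values, lo=-1.0, hi=1.0):
    # Running sum clamped to [lo, hi]; a step that would overshoot
    # is truncated at the boundary, so the sum sticks to +/-1.
    out, total = [], 0.0
    for v in values:
        total = min(max(total + v, lo), hi)
        out.append(total)
    return out

b = [0.72, -0.53, 0.17, 0.92, -0.33, 0.95]
print([round(x, 2) for x in clamped_cumsum(b)])  # [0.72, 0.19, 0.36, 1.0, 0.67, 1.0]
```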
asked Nov 01 '18 by Franc Weser

2 Answers

Loops aren't necessarily undesirable. If performance is an issue, consider numba. There's a ~300x improvement without materially changing your logic:

import numpy as np
from numba import njit

np.random.seed(0)
a = np.random.randint(-100, 100, 10000)
b = a/100

@njit
def cumsum_with_limits_nb(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0
    for i in range(n):
        x = values[i]
        if (sum_val+x <= 1) and (sum_val+x >= -1):
            res[i] = x
            sum_val += x
        elif sum_val+x >= 1:
            d = 1-sum_val # Remainder to 1
            res[i] = d
            sum_val += d
        elif sum_val+x <= -1:
            d = -1-sum_val # Remainder to -1
            res[i] = d
            sum_val += d
    return res

assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb(b)).all()

If you don't mind sacrificing some performance, you can rewrite this loop more succinctly:

@njit
def cumsum_with_limits_nb2(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0
    for i in range(n):
        x = values[i]
        next_sum = sum_val + x
        if np.abs(next_sum) >= 1:
            x = np.sign(next_sum) - sum_val
        res[i] = x
        sum_val += x
    return res

With similar performance to nb2, here's an alternative (thanks to @jdehesa):

@njit
def cumsum_with_limits_nb3(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0
    for i in range(n):
        x = min(max(sum_val + values[i], -1) , 1) - sum_val
        res[i] = x
        sum_val += x
    return res

Performance comparisons:

assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb(b)).all()
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb2(b)).all()
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb3(b)).all()

%timeit cumsum_with_limits(b)      # 12.5 ms per loop
%timeit cumsum_with_limits_nb(b)   # 40.9 µs per loop
%timeit cumsum_with_limits_nb2(b)  # 54.7 µs per loop
%timeit cumsum_with_limits_nb3(b)  # 54 µs per loop
answered Nov 09 '22 by jpp

Start with a regular cumsum:

b = ...
s = np.cumsum(b)

Find the first clip point:

i = np.argmax((s > 1) | (s < -1))

Adjust everything that follows:

s[i:] += (np.sign(s[i]) - s[i])

Rinse and repeat. This still requires a loop, but only over the adjustment points, which should generally be far fewer than the total size of the array.

b = ...
s = np.cumsum(b)
while True:
    i = np.argmax((s > 1) | (s < -1))
    if np.abs(s[i]) <= 1:
        break
    s[i:] += (np.sign(s[i]) - s[i])
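Wrapped up as a self-contained function, with the plain clamped loop as a reference check (the function names `cumsum_clipped` and `cumsum_loop` are mine, and the ±1 limits are hard-coded from the example):

```python
import numpy as np

def cumsum_clipped(b):
    # Repeatedly find the first point where the running sum escapes [-1, 1]
    # and shift everything after it back onto the boundary.
    s = np.cumsum(b)
    while True:
        i = np.argmax((s > 1) | (s < -1))  # first out-of-bounds index (0 if none)
        if np.abs(s[i]) <= 1:
            break
        s[i:] += np.sign(s[i]) - s[i]
    return s

def cumsum_loop(b):
    # Straightforward reference implementation of the clamped cumsum.
    out, total = [], 0.0
    for v in b:
        total = min(max(total + v, -1.0), 1.0)
        out.append(total)
    return np.array(out)

np.random.seed(0)
b = np.random.randint(-100, 100, 1000) / 100
assert np.allclose(cumsum_clipped(b), cumsum_loop(b))
```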

I still haven't found a way to completely pre-compute the adjustment points up front, so I would have to guess that the numba solution will be faster than this, even if you compiled this with numba.

Starting with np.random.seed(0), your original example has 3090 adjustment points, roughly 1/3 of the array. Unfortunately, with all the temporary arrays and repeated additions, that pushes the algorithmic complexity of my solution toward O(n²). This is completely unacceptable.

answered Nov 10 '22 by Mad Physicist