I have a numpy array (or pandas Series) of values:
import numpy as np

a = np.random.randint(-100, 100, 10000)
b = a / 100
I want to apply a custom cumsum function, but I haven't found a way to do it without loops. The custom function caps the running sum at an upper limit of 1 and a lower limit of -1: if adding a value would push the running sum beyond one of these limits, only the remainder up to that limit is added (so the sum is pinned at 1 or -1), and if the sum is already at a limit the addition effectively becomes 0.
Here is the loop version:
def cumsum_with_limits(values):
    cumsum_values = []
    sum = 0
    for i in values:
        if sum+i <= 1 and sum+i >= -1:
            sum += i
            cumsum_values.append(sum)
        elif sum+i >= 1:
            d = 1-sum # Remainder to 1
            sum += d
            cumsum_values.append(sum)
        elif sum+i <= -1:
            d = -1-sum # Remainder to -1
            sum += d
            cumsum_values.append(sum)
    return cumsum_values
Is there any way to vectorize this? I need to run this function on large datasets and performance is my current issue. Appreciate any help!
Update: I fixed the code a bit, and here is a small clarification of the expected output. Using np.random.seed(0), the first 6 values are:
b = [0.72, -0.53, 0.17, 0.92, -0.33, 0.95]
Expected output:
o = [0.72, 0.19, 0.36, 1, 0.67, 1]
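For a quick sanity check of those numbers, the loop version above can be run on the seeded data (this just re-creates the arrays from the question):

import numpy as np

np.random.seed(0)
a = np.random.randint(-100, 100, 10000)
b = a / 100

print(np.round(cumsum_with_limits(b)[:6], 2))
# should print the expected output above: [0.72 0.19 0.36 1.   0.67 1.  ]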
Loops aren't necessarily undesirable. If performance is an issue, consider numba. There's a ~330x improvement without materially changing your logic:
import numpy as np
from numba import njit

np.random.seed(0)
a = np.random.randint(-100, 100, 10000)
b = a / 100
@njit
def cumsum_with_limits_nb(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0.0
    for i in range(n):
        x = values[i]
        if (sum_val+x <= 1) and (sum_val+x >= -1):
            sum_val += x
        elif sum_val+x >= 1:
            d = 1-sum_val # Remainder to 1
            sum_val += d
        elif sum_val+x <= -1:
            d = -1-sum_val # Remainder to -1
            sum_val += d
        res[i] = sum_val  # store the running (limited) sum
    return res
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb(b)).all()
If you don't mind sacrificing some performance, you can rewrite this loop more succinctly:
@njit
def cumsum_with_limits_nb2(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0.0
    for i in range(n):
        x = values[i]
        next_sum = sum_val + x
        if np.abs(next_sum) >= 1:
            x = np.sign(next_sum) - sum_val  # clip the increment to the remainder
        sum_val += x
        res[i] = sum_val
    return res
With similar performance to nb2, here's an alternative (thanks to @jdehesa):
@njit
def cumsum_with_limits_nb3(values):
    n = len(values)
    res = np.empty(n)
    sum_val = 0.0
    for i in range(n):
        x = min(max(sum_val + values[i], -1), 1) - sum_val  # increment after clipping to [-1, 1]
        sum_val += x
        res[i] = sum_val
    return res
Performance comparisons:
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb(b)).all()
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb2(b)).all()
assert np.isclose(cumsum_with_limits(b), cumsum_with_limits_nb3(b)).all()
%timeit cumsum_with_limits(b) # 12.5 ms per loop
%timeit cumsum_with_limits_nb(b) # 40.9 µs per loop
%timeit cumsum_with_limits_nb2(b) # 54.7 µs per loop
%timeit cumsum_with_limits_nb3(b) # 54 µs per loop
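One general note on the timings (a numba caveat, not part of the original figures): the first call to an @njit function includes JIT compilation, so warm each function up once before timing it:

cumsum_with_limits_nb(b)           # first call triggers compilation
%timeit cumsum_with_limits_nb(b)   # subsequent calls time the compiled code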
Start with a regular cumsum:
b = ...
s = np.cumsum(b)
Find the first clip point:
i = np.argmax((s[0:] > 1) | (s[0:] < -1))
Adjust everything that follows:
s[i:] += (np.sign(s[i]) - s[i])
Rinse and repeat. This still requires a loop, but only over the adjustment points, which is generally expected to be much smaller than the total array size.
b = ...
s = np.cumsum(b)

while True:
    i = np.argmax((s[0:] > 1) | (s[0:] < -1))
    if np.abs(s[i]) <= 1:
        break
    s[i:] += (np.sign(s[i]) - s[i])
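As an illustration (a hand-traced sketch, using the first six seeded values and the expected output o from the question), two passes of this adjustment are enough for that prefix:

import numpy as np

b6 = np.array([0.72, -0.53, 0.17, 0.92, -0.33, 0.95])
s = np.cumsum(b6)                  # [0.72, 0.19, 0.36, 1.28, 0.95, 1.90]

# First pass: the first out-of-bounds point is index 3 (1.28 > 1),
# so everything from there on shifts down by 1 - 1.28 = -0.28.
i = np.argmax((s > 1) | (s < -1))  # i == 3
s[i:] += np.sign(s[i]) - s[i]      # [0.72, 0.19, 0.36, 1.00, 0.67, 1.62]

# Second pass: index 5 (1.62 > 1) shifts down by 1 - 1.62 = -0.62.
i = np.argmax((s > 1) | (s < -1))  # i == 5
s[i:] += np.sign(s[i]) - s[i]      # [0.72, 0.19, 0.36, 1.00, 0.67, 1.00]
# This matches the expected output o from the question.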
I still haven't found a way to completely pre-compute the adjustment points up front, so I would guess that the numba solution will be faster than this, even if you compiled this version with numba.
Starting with np.random.seed(0), your original example has 3090 adjustment points, which is approximately a third of the array. Unfortunately, with all the temporary arrays and extra sums, that makes the algorithmic complexity of my solution tend towards O(n²). This is completely unacceptable.