
A fast numpy way to find index in array where cumulative sum becomes greater?


Basically, the logic of my problem is:

running_sum = my_array.cumsum()
greater_than_threshold = running_sum > threshold
index = greater_than_threshold.searchsorted(True)

That is: Find the first index for which the cumulative sum of entries in my_array is above a threshold.

Now the problem is: I know that my_array will be large, but that the condition will be met fairly early. Of course that means I could just do a simple while loop to manually figure out when the cumulative sum is larger than the threshold, but I am wondering if there's a numpythonic way, i.e., a way to test for some condition without having the entire array evaluated.

Lagerbaer asked Jul 27 '16 18:07




1 Answer

EDIT: This method is slower than using NumPy's searchsorted and cumsum, see user2357112's comments and timeit test.
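For reference, the cumsum/searchsorted approach mentioned in this edit can be sketched as a small function. The name first_index_above is hypothetical, and the sketch assumes all entries are non-negative so that the running sum is sorted; it returns -1 when the threshold is never exceeded.

```python
import numpy as np

def first_index_above(my_array, threshold):
    """Return the first flat index where the cumulative sum exceeds threshold.

    Assumes non-negative entries, so the cumulative sum is non-decreasing
    and a binary search on it is valid. Returns -1 if never exceeded.
    """
    running_sum = np.cumsum(my_array)  # flattens multi-dimensional input
    # side="right" gives the first position whose value is strictly greater
    index = int(np.searchsorted(running_sum, threshold, side="right"))
    return index if index < running_sum.size else -1
```

Searching the running sum directly avoids building the intermediate boolean array, although cumsum still touches every element.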

cumsum will calculate cumulative sums for the entire array. Instead, just iterate over the array yourself:

running_sum = 0
index = -1  # stays -1 if the sum never exceeds the threshold
for i, entry in enumerate(my_array.flat):
    running_sum += entry
    if running_sum > threshold:
        index = i
        break
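A possible middle ground, not part of the original answer, is to run cumsum on fixed-size blocks and stop at the first block that crosses the threshold. The following is a minimal sketch: the function name first_index_above_chunked and the default chunk_size are assumptions chosen for illustration, and whether it beats a full cumsum depends on how early the condition is met.

```python
import numpy as np

def first_index_above_chunked(my_array, threshold, chunk_size=4096):
    """Return the first flat index where the cumulative sum exceeds threshold.

    Processes the array block by block so later blocks are never evaluated
    once the threshold is crossed. Returns -1 if it is never exceeded.
    """
    flat = np.asarray(my_array).ravel()  # may copy if the input is not contiguous
    offset = 0  # total of all blocks processed so far
    for start in range(0, flat.size, chunk_size):
        # cumulative sum of this block, shifted by the earlier total
        partial = np.cumsum(flat[start:start + chunk_size]) + offset
        above = partial > threshold
        if above.any():
            # argmax on a boolean array gives the first True position
            return start + int(np.argmax(above))
        offset = partial[-1]
    return -1
```

Because each block is checked with a boolean mask instead of a binary search, this version does not require the entries to be non-negative.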
Jonathan Jeffrey answered Sep 28 '22 03:09
