Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Build numpy array with multiple custom index ranges without explicit loop

In Numpy, is there a pythonic way to create array3 with custom ranges from array1 and array2 without a loop? The straightforward solution of iterating over the ranges works but since my arrays run into millions of items, I am looking for a more efficient solution (maybe syntactic sugar too).

For ex.,

array1 = np.array([10, 65, 200]) 
array2 = np.array([14, 70, 204])
array3 = np.concatenate([np.arange(array1[i], array2[i]) for i in
                         np.arange(0,len(array1))])

print array3

result: [10,11,12,13,65,66,67,68,69,200,201,202,203].

like image 691
snowmonkey Avatar asked Jun 03 '16 23:06

snowmonkey


2 Answers

Assuming the ranges do not overlap, you could build a mask which is nonzero where the index is between the ranges specified by array1 and array2 and then use np.flatnonzero to obtain an array of indices -- the desired array3:

import numpy as np

array1 = np.array([10, 65, 200]) 
array2 = np.array([14, 70, 204])

first, last = array1.min(), array2.max()
array3 = np.zeros(last-first+1, dtype='i1')
array3[array1-first] = 1
array3[array2-first] = -1
array3 = np.flatnonzero(array3.cumsum())+first
print(array3)

yields

[ 10  11  12  13  65  66  67  68  69 200 201 202 203]

For large len(array1), using_flatnonzero can be significantly faster than using_loop:

def using_flatnonzero(array1, array2):
    first, last = array1.min(), array2.max()
    array3 = np.zeros(last-first+1, dtype='i1')
    array3[array1-first] = 1
    array3[array2-first] = -1
    return np.flatnonzero(array3.cumsum())+first

def using_loop(array1, array2):
    return np.concatenate([np.arange(array1[i], array2[i]) for i in
                           np.arange(0,len(array1))])


array1, array2 = (np.random.choice(range(1, 11), size=10**4, replace=True)
                  .cumsum().reshape(2, -1, order='F'))


assert np.allclose(using_flatnonzero(array1, array2), using_loop(array1, array2))

In [260]: %timeit using_loop(array1, array2)
100 loops, best of 3: 9.36 ms per loop

In [261]: %timeit using_flatnonzero(array1, array2)
1000 loops, best of 3: 564 µs per loop

If the ranges overlap, then using_loop will return an array3 which contains duplicates. using_flatnonzero returns an array with no duplicates.


Explanation: Let's look at a small example with

array1 = np.array([10, 65, 200]) 
array2 = np.array([14, 70, 204])

The objective is to build an array which looks like goal, below. The 1's are located at index values [ 10, 11, 12, 13, 65, 66, 67, 68, 69, 200, 201, 202, 203] (i.e. array3):

In [306]: goal
Out[306]: 
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=int8)

Once we have the goal array, array3 can be obtained with a call to np.flatnonzero:

In [307]: np.flatnonzero(goal)
Out[307]: array([ 10,  11,  12,  13,  65,  66,  67,  68,  69, 200, 201, 202, 203])

goal has the same length as array2.max():

In [308]: array2.max()
Out[308]: 204

In [309]: goal.shape
Out[309]: (204,)

So we can begin by allocating

goal = np.zeros(array2.max()+1, dtype='i1')

and then filling in 1's at the index locations given by array1 and -1's at the indices given by array2:

In [311]: goal[array1] = 1
In [312]: goal[array2] = -1
In [313]: goal
Out[313]: 
array([ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0, -1,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,
        0,  0, -1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
        0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0,
       -1], dtype=int8)

Now applying cumsum (the cumulative sum) produces the desired goal array:

In [314]: goal = goal.cumsum(); goal
Out[314]: 
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0])

In [315]: np.flatnonzero(goal)
Out[315]: array([ 10,  11,  12,  13,  65,  66,  67,  68,  69, 200, 201, 202, 203])

That's the main idea behind using_flatnonzero. The subtraction of first was simply to save a bit of memory.

like image 50
unutbu Avatar answered Sep 28 '22 07:09

unutbu


Prospective Approach

I will go backwards on how to approach this problem.

Take the sample listed in the question. We have -

array1 = np.array([10, 65, 200])
array2 = np.array([14, 70, 204])

Now, look at the desired result -

result: [10,11,12,13,65,66,67,68,69,200,201,202,203]

Let's calculate the group lengths, as we would be needing those to explain the solution approach next.

In [58]: lens = array2 - array1

In [59]: lens
Out[59]: array([4, 5, 4])

The idea is to use 1's initialized array, which when cumumlative summed across the entire length would give us the desired result. This cumumlative summation would be the last step to our solution. Why 1's initialized? Well, because we have an array that increasing in steps of 1's except at specific places where we have shifts corresponding to new groups coming in.

Now, since cumsum would be the last step, so the step before it should give us something like -

array([ 10,   1,   1,   1,  52,   1,   1,   1,   1, 131,   1,   1,   1])

As discussed before, it's 1's filled with [10,52,131] at specific places. That 10 seems to be coming in from the first element in array1, but what about the rest? The second one 52 came in as 65-13 (looking at the result) and in it 13 came in the group that started with 10 and ran because of the length of the first group 4. So, if we do 65 - 10 - 4, we will get 51 and then add 1 to accomodate for boundary stop, we would have 52, which is the desired shifting value. Similarly, we would get 131.

Thus, those shifting-values could be computed, like so -

In [62]: np.diff(array1) - lens[:-1]+1
Out[62]: array([ 52, 131])

Next up, to get those shifting-places where such shifts occur, we can simply do cumulative summation on the group lengths -

In [65]: lens[:-1].cumsum()
Out[65]: array([4, 9])

For completeness, we need to pre-append 0 with the array of shifting-places and array1[0] for shifting-values.

So, we are set to present our approach in a step-by-step format!


Putting back the pieces

1] Get lengths of each group :

lens = array2 - array1

2] Get indices at which shifts occur and values to be put in 1's initialized array :

shift_idx = np.hstack((0,lens[:-1].cumsum()))
shift_vals = np.hstack((array1[0],np.diff(array1) - lens[:-1]+1))

3] Setup 1's initialized ID array for inserting those values at those indices listed in the step before :

id_arr = np.ones(lens.sum(),dtype=array1.dtype)
id_arr[shift_idx] = shift_vals

4] Finally do cumulative summation on the ID array :

output = id_arr.cumsum() 

Listed in a function format, we would have -

def using_ones_cumsum(array1, array2):
    lens = array2 - array1
    shift_idx = np.hstack((0,lens[:-1].cumsum()))
    shift_vals = np.hstack((array1[0],np.diff(array1) - lens[:-1]+1))
    id_arr = np.ones(lens.sum(),dtype=array1.dtype)
    id_arr[shift_idx] = shift_vals
    return id_arr.cumsum()
  

And it works on overlapping ranges too!

In [67]: array1 = np.array([10, 11, 200]) 
    ...: array2 = np.array([14, 18, 204])
    ...: 

In [68]: using_ones_cumsum(array1, array2)
Out[68]: 
array([ 10,  11,  12,  13,  11,  12,  13,  14,  15,  16,  17, 200, 201,
       202, 203])

Runtime test

Let's time the proposed approach against the other vectorized approach in @unutbu's flatnonzero based solution, which already proved to be much better than the loopy approach -

In [38]: array1, array2 = (np.random.choice(range(1, 11), size=10**4, replace=True)
    ...:                   .cumsum().reshape(2, -1, order='F'))

In [39]: %timeit using_flatnonzero(array1, array2)
1000 loops, best of 3: 889 µs per loop

In [40]: %timeit using_ones_cumsum(array1, array2)
1000 loops, best of 3: 235 µs per loop

Improvement!

Now, codewise NumPy doesn't like appending. So, those np.hstack calls could be avoided for a slightly improved version as listed below -

def get_ranges_arr(starts,ends):
    counts = ends - starts
    counts_csum = counts.cumsum()
    id_arr = np.ones(counts_csum[-1],dtype=int)
    id_arr[0] = starts[0]
    id_arr[counts_csum[:-1]] = starts[1:] - ends[:-1] + 1
    return id_arr.cumsum()

Let's time it against our original approach -

In [151]: array1,array2 = (np.random.choice(range(1, 11),size=10**4, replace=True)\
     ...:                                      .cumsum().reshape(2, -1, order='F'))

In [152]: %timeit using_ones_cumsum(array1, array2)
1000 loops, best of 3: 276 µs per loop

In [153]: %timeit get_ranges_arr(array1, array2)
10000 loops, best of 3: 193 µs per loop

So, we have a 30% performance boost there!

like image 31
Divakar Avatar answered Sep 28 '22 06:09

Divakar