Vectorized searchsorted numpy

Question

Assume that I have two arrays A and B, where both A and B are m x n. My goal is now, for each row of A and B, to find where I should insert the elements of row i of A in the corresponding row of B. That is, I wish to apply np.digitize or np.searchsorted to each row of A and B.

My naive solution is to simply iterate over the rows. However, this is far too slow for my application. My question is therefore: is there a vectorized implementation of either algorithm that I haven't managed to find?

Divakar · Accepted Answer

We can add each row some offset as compared to the previous row. We would use the same offset for both arrays. The idea is to use np.searchsorted on flattened version of input arrays thereafter and thus each row from b would be restricted to find sorted positions in the corresponding row in a. Additionally, to make it work for negative numbers too, we just need to offset for the minimum numbers as well.

So, we would have a vectorized implementation like so -

def searchsorted2d(a,b):
    m,n = a.shape
    max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
    r = max_num*np.arange(a.shape[0])[:,None]
    p = np.searchsorted( (a+r).ravel(), (b+r).ravel() ).reshape(m,-1)
    return p - n*(np.arange(m)[:,None])

Runtime test -

In [173]: def searchsorted2d_loopy(a,b):
     ...:     out = np.zeros(a.shape,dtype=int)
     ...:     for i in range(len(a)):
     ...:         out[i] = np.searchsorted(a[i],b[i])
     ...:     return out
     ...: 

In [174]: # Setup input arrays
     ...: a = np.random.randint(11,99,(10000,20))
     ...: b = np.random.randint(11,99,(10000,20))
     ...: a = np.sort(a,1)
     ...: b = np.sort(b,1)
     ...: 

In [175]: np.allclose(searchsorted2d(a,b),searchsorted2d_loopy(a,b))
Out[175]: True

In [176]: %timeit searchsorted2d_loopy(a,b)
10 loops, best of 3: 28.6 ms per loop

In [177]: %timeit searchsorted2d(a,b)
100 loops, best of 3: 13.7 ms per loop

Vectorized searchsorted numpy

Tags:

performance

python

vectorization

numpy

Tingiskhan

1 Answers

Divakar

Recent Activity

Donate For Us

Vectorized searchsorted numpy

Tags:

performance

python

vectorization

numpy

Tingiskhan

1 Answers

Divakar

Related questions

Recent Activity

Donate For Us