
Fast(er) numpy fancy indexing and reduction?

I'm trying to use fancy indexing to "join" two arrays and sum over one of the result's axes, and I'd like to accelerate it.

Something like this:

$ ipython
In [1]: import numpy as np
In [2]: ne, ds = 12, 6
In [3]: i = np.random.randn(ne, ds).astype('float32')
In [4]: t = np.random.randint(0, ds, size=(100000, ne)).astype('uint8')

In [5]: %timeit i[np.arange(ne), t].sum(-1)
10 loops, best of 3: 44 ms per loop

Is there a simple way to accelerate the statement in In [5]? Should I go with OpenMP and something like scipy.weave or Cython's prange?

npinto asked Aug 03 '12 16:08
1 Answer

numpy.take is much faster than fancy indexing for some reason. The only trick is that it treats the array as flat.

In [1]: a = np.random.randn(12,6).astype(np.float32)

In [2]: c = np.random.randint(0, 6, size=(100000, 12)).astype(np.uint8)

In [3]: r = np.arange(12)

In [4]: %timeit a[r,c].sum(-1)
10 loops, best of 3: 46.7 ms per loop

In [5]: rr, cc = np.broadcast_arrays(r,c)

In [6]: flat_index = rr*a.shape[1] + cc

In [7]: %timeit a.take(flat_index).sum(-1)
100 loops, best of 3: 5.5 ms per loop

In [8]: (a.take(flat_index).sum(-1) == a[r,c].sum(-1)).all()
Out[8]: True
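
If computing the flat index by hand feels error-prone, `np.ravel_multi_index` builds the same `rr*a.shape[1] + cc` index (with bounds checking, and it broadcasts the row and column arrays for you). A minimal self-contained sketch of the same trick:

```python
import numpy as np

ne, ds = 12, 6
a = np.random.randn(ne, ds).astype(np.float32)
c = np.random.randint(0, ds, size=(100000, ne)).astype(np.uint8)
r = np.arange(ne)

# Equivalent to rr*a.shape[1] + cc after broadcasting r against c;
# ravel_multi_index also validates that the indices are in range.
flat_index = np.ravel_multi_index((r, c), a.shape)

fast = a.take(flat_index).sum(-1)
slow = a[r, c].sum(-1)
assert np.allclose(fast, slow)
```

The speedup comes from `take` walking a flat C-contiguous buffer with a single integer index array, instead of the general multi-array fancy-indexing machinery.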

I think the only other way you're going to see much of a speed improvement beyond this would be to write a custom kernel for a GPU using something like PyCUDA.

user545424 answered Oct 17 '22 08:10