 

Fast numpy fancy indexing

My code for slicing a numpy array (via fancy indexing) is very slow. It is currently a bottleneck in my program.

a.shape
(3218, 6)

ts = time.time(); a[rows][:, cols]; te = time.time(); print('%.8f' % (te-ts));
0.00200009

What is the correct numpy call to get an array consisting of the subset of rows 'rows' and columns 'cols' of the matrix a? (In fact, I need the transpose of this result.)
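For reference, here is a minimal, self-contained sketch of the setup being timed; the exact contents of rows and cols are not given above, so the ones below are assumptions:

    import time
    import numpy as np

    a = np.random.randn(3218, 6)                       # same shape as above
    rows = np.random.randint(a.shape[0], size=2000)    # assumed row indices
    cols = np.array([1, 3, 4, 5])                      # assumed column indices

    ts = time.time()
    sub = a[rows][:, cols].T    # chained fancy indexing, then the needed transpose
    te = time.time()
    print('%.8f' % (te - ts))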

asked Jan 17 '13 by Oren


2 Answers

Let me try to summarize the excellent answers by Jaime and TheodrosZelleke and mix in some comments.

  1. Advanced (fancy) indexing always returns a copy, never a view.
  2. a[rows][:,cols] implies two fancy indexing operations, so an intermediate copy a[rows] is created and discarded. Handy and readable, but not very efficient. Moreover, beware that [:,cols] usually generates a Fortran-contiguous copy from a C-contiguous source.
  3. a[rows.reshape(-1,1),cols] is a single advanced-indexing expression, relying on the fact that rows.reshape(-1,1) and cols are broadcast to the shape of the intended result (see the sketch after this list).
  4. A common experience is that indexing in a flattened array can be more efficient than fancy indexing, so another approach is

    indx = rows.reshape(-1,1)*a.shape[1] + cols
    a.take(indx)
    

    or

    a.take(indx.flat).reshape(rows.size,cols.size)
    
  5. Efficiency will depend on memory access patterns and whether the starting array is C-contiguous or Fortran-contiguous, so experimentation is needed.

  6. Use fancy indexing only if really needed: basic slicing a[rstart:rstop:rstep, cstart:cstop:cstep] returns a view (although not contiguous) and should be faster!
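A minimal sketch comparing the three fancy-indexing variants from the list above (the array and index names mirror the question; the concrete sizes are assumptions):

    import numpy as np

    a = np.random.randn(3218, 6)
    rows = np.random.randint(a.shape[0], size=2000)
    cols = np.array([1, 3, 4, 5])

    # Two chained fancy-indexing operations (point 2): intermediate copy a[rows].
    r1 = a[rows][:, cols]

    # One fancy-indexing expression (point 3): rows.reshape(-1, 1) and cols
    # broadcast together to the shape of the result.
    r2 = a[rows.reshape(-1, 1), cols]

    # Flat (linear) indices combined with take (point 4).
    indx = rows.reshape(-1, 1) * a.shape[1] + cols
    r3 = a.take(indx)

    # All three produce the same values.
    assert np.array_equal(r1, r2) and np.array_equal(r1, r3)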

answered Sep 17 '22 by 2 revs

To my surprise, this somewhat lengthy expression, which first computes linear 1D indices, is more than 50% faster than the chained array indexing presented in the question:

(a.ravel()[(
   cols + (rows * a.shape[1]).reshape((-1,1))
   ).ravel()]).reshape(rows.size, cols.size)
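
Broken into steps (just a sketch reusing the same names), the expression does the following:

    flat = a.ravel()                                    # 1D view of a (a copy if a is not contiguous)
    lin = cols + (rows * a.shape[1]).reshape((-1, 1))   # 2D array of linear indices into flat
    result = flat[lin.ravel()].reshape(rows.size, cols.size)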

UPDATE: The OP updated the description of the shape of the initial array. With the updated size, the time saved is now above 99%:

In [93]: a = np.random.randn(3218, 1415)

In [94]: rows = np.random.randint(a.shape[0], size=2000)

In [95]: cols = np.random.randint(a.shape[1], size=6)

In [96]: timeit a[rows][:, cols]
10 loops, best of 3: 186 ms per loop

In [97]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
1000 loops, best of 3: 1.56 ms per loop

INITIAL ANSWER: Here is the transcript:

In [79]: a = np.random.randn(3218, 6)
In [80]: a.shape
Out[80]: (3218, 6)

In [81]: rows = np.random.randint(a.shape[0], size=2000)
In [82]: cols = np.array([1,3,4,5])

Time method 1:

In [83]: timeit a[rows][:, cols]
1000 loops, best of 3: 1.26 ms per loop

Time method 2:

In [84]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
1000 loops, best of 3: 568 us per loop

Check that results are actually the same:

In [85]: result1 = a[rows][:, cols]
In [86]: result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)

In [87]: np.sum(result1 - result2)
Out[87]: 0.0
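
A stricter check than summing the difference (where mismatches could in principle cancel out) would be an element-wise comparison, e.g.:

    np.array_equal(result1, result2)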
answered Sep 21 '22 by tzelleke