
How to speed up iteration over part of a numpy array

Tags: python, numpy

I have a large 3-dimensional array in numpy (let's say of size 100x100x100). I'd like to iterate over just part of it (approx. 70% of the elements) many times, and I have a boolean array of the same shape that defines whether each element should have the operation applied or not.

My current method is to first create an array "coords" of shape (N,3), which contains all the coordinates on which to do the operation, and then loop over it:

for i in range(many_iterations):
    for j in coords:
        idx = tuple(j)  # a row of coords must be a tuple to address a single element
        large_array[idx] = do_something(large_array[idx])

Would it in fact be better to evaluate the whole array and include an extra operation in the loop to test the boolean array? (Bear in mind that the truth evaluation is then done many_iterations times rather than once.) My thought was that the payoff in this case would be getting rid of the for loops:

large_array = do_something(large_array if condition True)

How would this last line be made to work in this case?

Rowan asked May 16 '13 15:05




2 Answers

You might get better performance by first creating an array of booleans that defines where you should operate:

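import numpy as np
big_3d_arr = np.random.rand(100, 100, 100) * 1000  # example data (an assumption); any 100x100x100 array works
where_to_operate_arr = big_3d_arr < 500  # boolean mask; or whatever your condition is
# apply do_something only to the selected elements and write the results back
big_3d_arr[where_to_operate_arr] = do_something(big_3d_arr[where_to_operate_arr])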

Something like that might work, but depending on your application you may have to iterate and do the boolean indexing in chunks.
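To tie this back to the question's repeated loop, here is a minimal sketch that builds the mask once and reuses it on every iteration. The `do_something` function, the random data, and the iteration count are placeholders invented for illustration, and the operation is assumed to be element-wise and vectorizable. The `np.where` variant corresponds to the "evaluate the whole array and test the boolean array" idea from the question; which of the two is faster depends on the operation and the fraction of selected elements, so it is worth timing both.

```python
import numpy as np

def do_something(values):
    # Hypothetical stand-in for the real element-wise operation.
    return values * 0.5 + 1.0

rng = np.random.default_rng(0)
large_array = rng.random((100, 100, 100))
mask = rng.random(large_array.shape) < 0.7  # roughly 70% of elements selected
original = large_array.copy()
many_iterations = 10

# Variant 1: boolean indexing. Only the selected elements are gathered,
# transformed, and scattered back in place; the mask is computed once.
for _ in range(many_iterations):
    large_array[mask] = do_something(large_array[mask])

# Variant 2: evaluate the operation on the whole array and keep the old
# value wherever the mask is False.
alt = original.copy()
for _ in range(many_iterations):
    alt = np.where(mask, do_something(alt), alt)
```

Both variants leave the elements where `mask` is False untouched and produce the same result; variant 1 does less arithmetic, while variant 2 avoids the gather and scatter steps.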

mdscruggs answered Oct 02 '22 12:10


You're basically trying to recreate masked arrays. This page gives a good introduction.
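As a rough illustration of the masked-array idea, the sketch below uses `numpy.ma` on a small array. The data, the threshold, and the doubling operation are made up for the example. Note that `numpy.ma` uses the opposite convention to the question's boolean array: `mask=True` means "ignore this element". Masked arrays are mainly a convenience and are not guaranteed to be faster than plain boolean indexing.

```python
import numpy as np
import numpy.ma as ma

rng = np.random.default_rng(1)
data = rng.random((4, 4, 4))
operate = data < 0.7  # True where the operation should be applied (example condition)

# Invert the selection: in numpy.ma, True in the mask marks elements to skip.
masked = ma.masked_array(data, mask=~operate)

# Arithmetic and reductions only consider the unmasked elements.
doubled = masked * 2.0
selected_mean = masked.mean()

# Write the results back into a full array, leaving skipped elements unchanged.
out = data.copy()
out[operate] = doubled.compressed()
```

Here `compressed()` returns the unmasked values as a 1-D array in the same order that boolean indexing with `operate` visits them, which is why the write-back lines up.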

tom10 answered Oct 02 '22 14:10