
How do I maximize efficiency with numpy arrays?

I am just getting to know numpy, and I am impressed by its claims of C-like efficiency with memory access in its ndarrays. I wanted to see the differences between these and ordinary Python lists for myself, so I ran a quick timing test, performing a few of the same simple tasks with numpy and without it. Numpy outperformed regular lists by an order of magnitude in the allocation of and arithmetic operations on arrays, as expected. But this segment of code, identical in both tests, took about 1/8 of a second with a regular list, and slightly over 2.5 seconds with numpy:

file = open('timing.log','w')
for num in a2:
    if num % 1000 == 0:
        file.write("Multiple of 1000!\r\n")

file.close()

Does anyone know why this might be, and if there is some other syntax I should be using for operations like this to take better advantage of what the ndarray can do?

Thanks...

EDIT: To answer Wayne's comment... I timed them both repeatedly and in different orders and got pretty much identical results each time, so I doubt it's another process. I put

start = time()
at the top of the file after the numpy import and then I have statements like
print 'Time after traversal:\t',(time() - start)
throughout.
pr0crastin8r asked Aug 03 '10 18:08


1 Answer

a2 is a NumPy array, right? One possible reason it might be taking so long in NumPy (if other processes' activity doesn't account for it, as Wayne Werner suggested) is that you're iterating over the array using a Python loop. At every step of the iteration, Python has to fetch a single value out of the NumPy array and wrap it in a new scalar object that the loop body can work with, which is not a particularly fast operation.
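To make the per-element overhead a bit more concrete, here is a minimal sketch. The small `a2` below is only a hypothetical stand-in for your array, and `as_list` is a name I made up for the list version. It shows that a Python loop over an ndarray receives a freshly created NumPy scalar on each step, whereas a loop over a list just hands back the int objects the list already holds.

```python
import numpy as np

a2 = np.arange(5)        # hypothetical stand-in for the asker's a2
as_list = a2.tolist()    # the same values as a list of plain Python ints

# Take the first element the same way a for-loop would.
first_from_array = next(iter(a2))
first_from_list = next(iter(as_list))

# The array hands back a NumPy scalar object created for this step;
# the list hands back an int that was already stored in it.
print(type(first_from_array))
print(type(first_from_list))
```

The exact scalar type depends on the array's dtype and the platform, so treat the printed names as illustrative.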

NumPy works much better when you are able to perform operations on the whole array as a unit. In your case, one option (maybe not even the fastest) would be

file.write("Multiple of 1000!\r\n" * (a2 % 1000 == 0).sum())
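Broken into steps, that one-liner might look like the sketch below. The sample values, the names `mask`, `count` and `out`, and the in-memory `io.StringIO` buffer standing in for your log file are all assumptions for illustration.

```python
import io
import numpy as np

a2 = np.array([0, 999, 1000, 2500, 3000])  # hypothetical sample data

mask = a2 % 1000 == 0    # boolean array: True where the element is a multiple of 1000
count = int(mask.sum())  # True counts as 1, so the sum is the number of matches

out = io.StringIO()      # in-memory stand-in for the log file
out.write("Multiple of 1000!\r\n" * count)  # one write instead of one per match
```

Both the modulo and the comparison run over the whole array inside NumPy, so no Python-level loop is involved.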

Try comparing that to the pure-Python equivalent,

file.write("Multiple of 1000!\r\n" * len(list(filter(lambda i: i % 1000 == 0, a2))))

or

file.write("Multiple of 1000!\r\n" * sum(1 for i in a2 if i % 1000 == 0))
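If you want to run the comparison yourself, something along these lines should do as a rough sketch. The array size `N`, the helper names `count_loop` and `count_vectorized`, and the use of `timeit` are my own assumptions, not anything taken from your script, and the timings will vary by machine.

```python
import timeit
import numpy as np

N = 200_000                 # hypothetical size; the question does not state one
a2 = np.arange(N)           # stand-in for the asker's array
a2_list = a2.tolist()       # same values as plain Python ints

def count_loop(values):
    """Count multiples of 1000 with an element-by-element Python loop."""
    count = 0
    for num in values:
        if num % 1000 == 0:
            count += 1
    return count

def count_vectorized(arr):
    """Count multiples of 1000 with whole-array NumPy operations."""
    return int(np.count_nonzero(arr % 1000 == 0))

# All three approaches should agree on the count.
loop_over_array = count_loop(a2)
loop_over_list = count_loop(a2_list)
vectorized = count_vectorized(a2)

# Time each approach over a few repetitions and print the totals.
for label, fn, arg in [
    ("Python loop over ndarray", count_loop, a2),
    ("Python loop over list", count_loop, a2_list),
    ("vectorized NumPy", count_vectorized, a2),
]:
    seconds = timeit.timeit(lambda: fn(arg), number=3)
    print("%-26s %.4f s" % (label, seconds))
```

I would expect the loop over the ndarray to come out slowest and the vectorized version fastest, but measure it on your own data before drawing conclusions.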
David Z answered Sep 21 '22 04:09