
Why is numpy.array so slow?

I am baffled by this:

    def main():
        for i in xrange(2560000):
            a = [0.0, 0.0, 0.0]

    main()

    $ time python test.py

    real    0m0.793s

Let's now see with numpy:

    import numpy

    def main():
        for i in xrange(2560000):
            a = numpy.array([0.0, 0.0, 0.0])

    main()

    $ time python test.py

    real    0m39.338s

Holy CPU cycles batman!

Using numpy.zeros(3) improves things, but still not enough IMHO:

    $ time python test.py

    real    0m5.610s
    user    0m5.449s
    sys     0m0.070s

numpy.version.version = '1.5.1'

If you are wondering if the list creation is skipped for optimization in the first example, it is not:

      5          19 LOAD_CONST               2 (0.0)
                 22 LOAD_CONST               2 (0.0)
                 25 LOAD_CONST               2 (0.0)
                 28 BUILD_LIST               3
                 31 STORE_FAST               1 (a)
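For anyone repeating this check on a current interpreter, here is a minimal sketch using the standard-library dis module. It is written for Python 3 (so range instead of xrange, and a tiny loop count, both assumptions of this sketch rather than part of the original test). The exact opcodes and offsets differ between Python versions, so it only looks for a BUILD_LIST instruction somewhere in the function's bytecode.

```python
import dis

def main():
    for i in range(3):
        a = [0.0, 0.0, 0.0]

# Print the full disassembly, comparable to the listing in the question.
dis.dis(main)

# Collect the opcode names so the check does not depend on exact offsets.
opnames = [ins.opname for ins in dis.get_instructions(main)]
print("BUILD_LIST" in opnames)
```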
Stefano Borini asked Jul 02 '11 20:07



2 Answers

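NumPy is optimised for large amounts of data. Give it a tiny length-3 array and, unsurprisingly, it performs poorly: the fixed cost of each numpy.array call (inspecting the input list, choosing a dtype, building the array object) dwarfs the work of storing three floats.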

Consider a separate test:

    import timeit

    reps = 100

    pythonTest = timeit.Timer('a = [0.] * 1000000')
    numpyTest = timeit.Timer('a = numpy.zeros(1000000)', setup='import numpy')
    # empty simply allocates the memory, so the initial contents of the
    # array are whatever happened to be there (random noise)
    uninitialised = timeit.Timer('a = numpy.empty(1000000)', setup='import numpy')

    print 'python list:', pythonTest.timeit(reps), 'seconds'
    print 'numpy array:', numpyTest.timeit(reps), 'seconds'
    print 'uninitialised array:', uninitialised.timeit(reps), 'seconds'

And the output is:

    python list: 1.22042918205 seconds
    numpy array: 1.05412316322 seconds
    uninitialised array: 0.0016028881073 seconds

It would seem that it is the zeroing of the array that takes all the time for numpy. So unless you need the array to be initialised, try using empty.
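To make the point about array size concrete, here is a minimal sketch that compares building many length-3 arrays with building one (N, 3) array holding the same data. It assumes Python 3 and a current NumPy (unlike the Python 2 snippets in this thread). The names many_small and one_big and the value of N are made up for this illustration, and N is kept far below the question's 2,560,000 so the run stays short. Timings will vary by machine, so none are quoted here.

```python
import timeit

import numpy as np

N = 20000  # illustrative size, much smaller than in the question


def many_small():
    # One tiny array per iteration: NumPy's per-call overhead is paid N times.
    return [np.zeros(3) for _ in range(N)]


def one_big():
    # A single (N, 3) array: the per-call overhead is paid once.
    return np.zeros((N, 3))


t_small = timeit.timeit(many_small, number=5)
t_big = timeit.timeit(one_big, number=5)
print("many small arrays:", t_small, "seconds")
print("one big array:    ", t_big, "seconds")
```

If the per-point arrays are only ever processed together, rows of the single (N, 3) array can stand in for them.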

Dunes answered Sep 20 '22 14:09


Holy CPU cycles batman!, indeed.

But please consider instead something fundamental to numpy: sophisticated linear-algebra-based functionality (like random number generation or singular value decomposition). Now, consider these seemingly simple calculations:

    In []: A= rand(2560000, 3)
    In []: %timeit rand(2560000, 3)
    1 loops, best of 3: 296 ms per loop
    In []: %timeit u, s, v= svd(A, full_matrices= False)
    1 loops, best of 3: 571 ms per loop

and please trust me that this kind of performance will not be beaten significantly by any package currently available.

So, please describe your real problem, and I'll try to figure out a decent numpy-based solution for it.

Update:
Here is some simple code for ray-sphere intersection:

    import numpy as np

    def mag(X):
        # magnitude of each column
        return (X ** 2).sum(0) ** .5

    def closest(R, c):
        # closest point on each ray to the center, and its distance
        P = np.dot(c.T, R) * R
        return P, mag(P - c)

    def intersect(R, P, h, r):
        # intersection of rays and sphere
        return P - (h * (2 * r - h)) ** .5 * R

    # set up
    c, r = np.array([10, 10, 10])[:, None], 2.  # center, radius
    n = 500000  # an integer, since it is used as an array dimension
    R = np.random.rand(3, n)  # some random rays in first octant
    R = R / mag(R)  # normalized to unit length

    # find rays which will intersect sphere
    P, b = closest(R, c)
    wi = b <= r

    # and for those which will, find the intersection
    X = intersect(R[:, wi], P[:, wi], r - b[wi], r)

Apparently we calculated correctly:

    In []: allclose(mag(X- c), r)
    Out[]: True
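As a cross-check on the geometry, here is a minimal pure-Python sketch for a single ray. The function name intersect_one is made up for this illustration and is not part of the code in this answer. It assumes, like the setup there, that the ray starts at the origin, has a unit-length direction, and points towards the sphere. It relies on h*(2*r - h) with h = r - b being equal to r**2 - b**2, the squared half-chord length. It needs Python 3.8 or later for math.dist.

```python
import math


def intersect_one(d, c, r):
    """First intersection of the ray t*d (d of unit length) with a sphere.

    Returns None when the ray misses. Reference version, one ray at a time.
    Assumes the sphere lies in front of the ray origin.
    """
    t = sum(di * ci for di, ci in zip(d, c))  # ray parameter of the closest point
    p = [t * di for di in d]                  # closest point on the ray to the center
    b = math.dist(p, c)                       # its distance from the center
    if b > r:
        return None
    # max() guards against a tiny negative value from rounding when b is ~r
    half_chord = math.sqrt(max(r * r - b * b, 0.0))
    return [pi - half_chord * di for pi, di in zip(p, d)]


c, r = (10.0, 10.0, 10.0), 2.0
d = [1 / math.sqrt(3)] * 3  # a ray pointing straight at the center
x = intersect_one(d, c, r)
print(x, math.dist(x, c))
```

A loop over rays calling such a function is the style the vectorised version avoids; it is useful here only for checking individual results.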

And some timings:

    In []: %timeit P, b= closest(R, c)
    10 loops, best of 3: 93.4 ms per loop
    In []: n/ 0.0934
    Out[]: 5353319 #=> more than 5 million detections of possible intersections/s
    In []: %timeit X= intersect(R[:, wi], P[:, wi], r- b[wi], r)
    10 loops, best of 3: 32.7 ms per loop
    In []: X.shape[1]/ 0.0327
    Out[]: 874037 #=> almost 1 million actual intersections/s

These timings were done on a very modest machine. On a modern machine, a significant speed-up can still be expected.

Anyway, this is only a short demonstration of how to code with numpy.

eat answered Sep 17 '22 14:09