I am baffled by this
def main():
    for i in xrange(2560000):
        a = [0.0, 0.0, 0.0]

main()

$ time python test.py
real    0m0.793s
Let's now see with numpy:
import numpy

def main():
    for i in xrange(2560000):
        a = numpy.array([0.0, 0.0, 0.0])

main()

$ time python test.py
real    0m39.338s
Holy CPU cycles batman!
Using numpy.zeros(3) improves, but it is still not enough IMHO:

$ time python test.py
real    0m5.610s
user    0m5.449s
sys     0m0.070s
numpy.version.version = '1.5.1'
If you are wondering whether the list creation is skipped as an optimization in the first example, it is not:
  5          19 LOAD_CONST               2 (0.0)
             22 LOAD_CONST               2 (0.0)
             25 LOAD_CONST               2 (0.0)
             28 BUILD_LIST               3
             31 STORE_FAST               1 (a)
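(For reference, that bytecode listing can be reproduced with the standard dis module; a minimal sketch:)

import dis

def main():
    for i in xrange(2560000):
        a = [0.0, 0.0, 0.0]

dis.dis(main)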
Numpy is optimised for large amounts of data. Give it a tiny length-3 array and, unsurprisingly, it performs poorly.
Consider a separate test:

import timeit

reps = 100

pythonTest = timeit.Timer('a = [0.] * 1000000')
numpyTest = timeit.Timer('a = numpy.zeros(1000000)', setup='import numpy')
uninitialised = timeit.Timer('a = numpy.empty(1000000)', setup='import numpy')
# empty simply allocates the memory. Thus the initial contents of the array
# is random noise

print 'python list:', pythonTest.timeit(reps), 'seconds'
print 'numpy array:', numpyTest.timeit(reps), 'seconds'
print 'uninitialised array:', uninitialised.timeit(reps), 'seconds'
And the output is
python list: 1.22042918205 seconds
numpy array: 1.05412316322 seconds
uninitialised array: 0.0016028881073 seconds
It would seem that it is the zeroing of the array that is taking all the time for numpy, so unless you need the array to be initialised, try using empty.
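If it helps, here is a minimal sketch of that pattern (the names are illustrative): the array comes from empty, and every element is written before it is ever read, so the uninitialised contents never matter.

import numpy

n = 1000000
a = numpy.empty(n)            # allocation only; initial contents are arbitrary
a[:] = numpy.arange(n) * 0.5  # every element is overwritten here before any read,
                              # so the garbage left by empty() is never observed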
Holy CPU cycles batman!, indeed. But please rather consider something very fundamental related to numpy: sophisticated linear-algebra-based functionality (like random numbers or singular value decomposition). Now, consider these seemingly simple calculations:
In []: A = rand(2560000, 3)

In []: %timeit rand(2560000, 3)
1 loops, best of 3: 296 ms per loop

In []: %timeit u, s, v = svd(A, full_matrices=False)
1 loops, best of 3: 571 ms per loop
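(The session above assumes rand and svd are already in the namespace, as under IPython's pylab mode. For reference, a standalone sketch of roughly the same measurement using timeit; the exact numbers depend on the machine and the BLAS/LAPACK build numpy links against.)

import timeit

setup = ('from numpy.random import rand\n'
         'from numpy.linalg import svd\n'
         'A = rand(2560000, 3)')

print 'rand:', min(timeit.repeat('rand(2560000, 3)', setup=setup, number=1, repeat=3)), 's'
print 'svd :', min(timeit.repeat('svd(A, full_matrices=False)', setup=setup, number=1, repeat=3)), 's'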
and please trust me that this kind of performance will not be beaten significantly by any package currently available.
So, please describe your real problem, and I'll try to figure out a decent numpy-based solution for it.
Update:
Here is some simple code for ray-sphere intersection:
import numpy as np

def mag(X):
    # magnitude of each column vector of X
    return (X ** 2).sum(0) ** .5

def closest(R, c):
    # closest point on each ray to the center, and its distance from the center
    P = np.dot(c.T, R) * R
    return P, mag(P - c)

def intersect(R, P, h, r):
    # intersection of rays with the sphere
    return P - (h * (2 * r - h)) ** .5 * R

# set up
c, r = np.array([10, 10, 10])[:, None], 2.  # center, radius
n = int(5e5)
R = np.random.rand(3, n)                    # some random rays in the first octant
R = R / mag(R)                              # normalized to unit length

# find rays which will intersect the sphere
P, b = closest(R, c)
wi = b <= r

# and for those which will, find the intersection
X = intersect(R[:, wi], P[:, wi], r - b[wi], r)
Apparently we calculated correctly:
In []: allclose(mag(X - c), r)
Out[]: True
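For intuition about the (h * (2 * r - h)) ** .5 term, here is a scalar sketch of the same geometry for a single ray (the R1/c1 names are illustrative, not part of the code above): with b the distance from the center to the ray and h = r - b, the expression equals sqrt(r**2 - b**2), i.e. the distance from the foot of the perpendicular back to the entry point on the sphere.

import numpy as np

# One unit-length ray from the origin, roughly towards the sphere at c = (10, 10, 10), r = 2
R1 = np.array([1.0, 1.0, 0.9])
R1 /= np.linalg.norm(R1)
c1, r1 = np.array([10.0, 10.0, 10.0]), 2.0

P1 = np.dot(c1, R1) * R1                      # closest point on the ray to the center
b1 = np.linalg.norm(P1 - c1)                  # distance from the center to the ray
h1 = r1 - b1
X1 = P1 - np.sqrt(h1 * (2 * r1 - h1)) * R1    # step back by sqrt(r**2 - b**2) along the ray

print(np.allclose(np.linalg.norm(X1 - c1), r1))   # True: X1 lies on the sphere surface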
And some timings:
In []: %timeit P, b = closest(R, c)
10 loops, best of 3: 93.4 ms per loop

In []: n / 0.0934
Out[]: 5353319  # => more than 5 million detections of possible intersections / s

In []: %timeit X = intersect(R[:, wi], P[:, wi], r - b[wi], r)
10 loops, best of 3: 32.7 ms per loop

In []: X.shape[1] / 0.0327
Out[]: 874037  # => almost 1 million actual intersections / s
These timings were done on a very modest machine. On a modern machine, a significant speed-up can still be expected.
Anyway, this is only a short demonstration of how to code with numpy.