I rewrote my neural net from pure Python to NumPy, but now it runs even slower. So I tried these two functions:
def d():
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    c = [i*j for i,j in zip(a,b)]
    return c
def e():
    a = np.array([1,2,3,4,5])
    b = np.array([10,20,30,40,50])
    c = a*b
    return c
timeit d = 1.77135205057
timeit e = 17.2464673758
NumPy is about 10 times slower. Why is that, and how do I use NumPy properly?
One tutorial on speeding up NumPy array processing with Cython reports that explicitly declaring the "ndarray" data type can make array processing up to 1250x faster. More generally, explicitly specifying the data types of variables lets Cython generate code that runs much faster.
NumPy arrays are faster than Python lists for the following reasons: an array is a collection of elements of a single data type stored in contiguous memory. A Python list, on the other hand, can hold objects of different types, and it stores references to objects that may be scattered across memory.
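To make the layout difference concrete, here is a small sketch. The variable names are only illustrative, and the dtype is fixed to int64 as an assumption so the byte counts do not depend on the platform default.

```python
import numpy as np

# One dtype for every element, stored back to back in memory.
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)
print(arr.dtype)                  # int64
print(arr.itemsize)               # 8 bytes per element
print(arr.strides)                # (8,): step 8 bytes to reach the next element
print(arr.flags['C_CONTIGUOUS'])  # True

# A list can mix types; each slot is a reference to a separate object.
mixed_list = [1, 2.5, "three"]
print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'str']

# Mixed numeric input is promoted to a single common dtype.
promoted = np.array([1, 2.5, 3])
print(promoted.dtype)             # float64
```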
NumPy is the fundamental package for scientific computing in Python. NumPy arrays facilitate advanced mathematical and other types of operations on large amounts of data. Typically, such operations are executed more efficiently and with less code than is possible using Python's built-in sequences.
NumPy uses much less memory to store data. NumPy arrays take significantly less memory than Python lists. NumPy also provides a mechanism for specifying the data types of the contents, which allows further optimisation of the code.
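A rough way to see the memory difference is sketched below. The size of 1000 elements and the int64 dtype are assumptions chosen for illustration, and the list figure is only an approximation, since it simply adds the size of the list container to the size of each int object it references.

```python
import sys
import numpy as np

n = 1000
values = list(range(n))
arr = np.array(values, dtype=np.int64)

# List: the container (an array of references) plus every int object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Array: the raw element buffer, n * itemsize bytes.
array_bytes = arr.nbytes

print("list :", list_bytes, "bytes (container + int objects)")
print("array:", array_bytes, "bytes of element data")
```

Choosing a smaller dtype, for example np.int32, halves the array's element buffer again, which is the kind of optimisation the data type mechanism allows.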
I would assume that the discrepancy is because you're constructing lists and arrays in e, whereas you're only constructing lists in d. Consider:
import numpy as np
def d():
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    c = [i*j for i,j in zip(a,b)]
    return c
def e():
    a = np.array([1,2,3,4,5])
    b = np.array([10,20,30,40,50])
    c = a*b
    return c
#Warning: Functions with mutable default arguments are below.
# This code is only for testing and would be bad practice in production!
def f(a=[1,2,3,4,5],b=[10,20,30,40,50]):
    c = [i*j for i,j in zip(a,b)]
    return c
def g(a=np.array([1,2,3,4,5]),b=np.array([10,20,30,40,50])):
    c = a*b
    return c
import timeit
print(timeit.timeit('d()','from __main__ import d'))
print(timeit.timeit('e()','from __main__ import e'))
print(timeit.timeit('f()','from __main__ import f'))
print(timeit.timeit('g()','from __main__ import g'))
Here the functions f and g avoid recreating the lists/arrays each time around, and we get very similar performance:
1.53083586693
15.8963699341
1.33564996719
1.69556999207
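To see where the time in e goes, one option is to time the array construction and the multiplication separately. This is only a sketch: the helper names build_arrays and multiply_arrays are made up for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

a_list = [1, 2, 3, 4, 5]
b_list = [10, 20, 30, 40, 50]

# Arrays built once, outside the timed code.
a_arr = np.array(a_list)
b_arr = np.array(b_list)

def build_arrays():
    # Only the list -> ndarray conversion that e() repeats on every call.
    return np.array(a_list), np.array(b_list)

def multiply_arrays():
    # Only the element-wise product on prebuilt arrays.
    return a_arr * b_arr

number = 100000
t_build = timeit.timeit(build_arrays, number=number)
t_mult = timeit.timeit(multiply_arrays, number=number)
print("build two arrays:        ", t_build)
print("multiply prebuilt arrays:", t_mult)
```

If the first number dominates, the practical fix is to create the arrays once and reuse them, rather than converting from lists inside a hot function.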
Note that list-comp + zip still wins. However, if we make the arrays sufficiently big, numpy wins hands down:
t1 = [1,2,3,4,5] * 100
t2 = [10,20,30,40,50] * 100
t3 = np.array(t1)
t4 = np.array(t2)
print(timeit.timeit('f(t1,t2)','from __main__ import f,t1,t2',number=10000))
print(timeit.timeit('g(t3,t4)','from __main__ import g,t3,t4',number=10000))
My results are:
0.602419137955
0.0263929367065
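To find roughly where NumPy starts to win on a given machine, the same comparison can be repeated over several sizes. The sketch below uses a hypothetical helper, compare, and arbitrary sizes. No expected timings are shown because they depend on the hardware and the Python/NumPy versions.

```python
import timeit
import numpy as np

def compare(n, number=1000):
    """Time list-comp + zip against a * b for inputs of length n."""
    a_list = list(range(n))
    b_list = list(range(n))
    # Arrays are built once, outside the timed code.
    a_arr = np.array(a_list)
    b_arr = np.array(b_list)
    t_list = timeit.timeit(
        lambda: [i * j for i, j in zip(a_list, b_list)], number=number)
    t_numpy = timeit.timeit(lambda: a_arr * b_arr, number=number)
    return t_list, t_numpy

for n in (5, 50, 500, 5000):
    t_list, t_numpy = compare(n)
    print(n, t_list, t_numpy)
```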