Recently I answered THIS question, which asked about multiplying two lists. Some user suggested the following way using NumPy, alongside mine, which I think is the proper way:
(a.T*b).T
I also found that array.resize()
has the same performance. Anyway, another answer suggested a solution using a list comprehension:
[[m*n for n in second] for m, second in zip(b,a)]
But after benchmarking I saw that the list comprehension performs much faster than NumPy:
from timeit import timeit
s1="""
a=[[2,3,5],[3,6,2],[1,3,2]]
b=[4,2,1]
[[m*n for n in second] for m, second in zip(b,a)]
"""
s2="""
a=np.array([[2,3,5],[3,6,2],[1,3,2]])
b=np.array([4,2,1])
(a.T*b).T
"""
print(' first: ', timeit(stmt=s1, number=1000000))
print('second : ', timeit(stmt=s2, number=1000000, setup="import numpy as np"))
result :
first: 1.49778485298
second : 7.43547797203
As you can see, the list comprehension is approximately 5 times faster. But the most surprising thing was that NumPy is faster without using the transpose; yet for the following code:
a=np.array([[2,3,5],[3,6,2],[1,3,2]])
b=np.array([[4],[2],[1]])
a*b
The list comprehension was still 5 times faster. So, aside from the point that list comprehensions run in C, here we used two nested loops and a zip
function. What can be the reason? Is it because of the * operation
in NumPy?
Also note that there is no problem with timeit
here: I put the import
in the setup argument.
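To double-check that point (a minimal sketch of my own, not from the original question): timeit excludes setup time from its result, so an import only shows up in the measurement when it sits inside the timed statement:

```python
from timeit import timeit

# Re-importing an already-loaded module still costs a lookup on every
# iteration; putting the import in setup pays that cost outside the timing.
t_in_stmt = timeit(stmt="import numpy as np", number=10000)
t_in_setup = timeit(stmt="pass", setup="import numpy as np", number=10000)

print(t_in_stmt > t_in_setup)
```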
I also tried it with larger arrays; the difference gets smaller but still doesn't make sense:
s1="""
a=[[2,3,5],[3,6,2],[1,3,2]]*10000
b=[4,2,1]*10000
[[m*n for n in second] for m, second in zip(b,a)]
"""
s2="""
a=np.array([[2,3,5],[3,6,2],[1,3,2]]*10000)
b=np.array([4,2,1]*10000)
(a.T*b).T
"""
print(' first: ', timeit(stmt=s1, number=1000))
print('second : ', timeit(stmt=s2, number=1000, setup="import numpy as np"))
result :
first: 10.7480301857
second : 13.1278889179
Creation of numpy arrays is much slower than creation of lists:
In [153]: %timeit a = [[2,3,5],[3,6,2],[1,3,2]]
1000000 loops, best of 3: 308 ns per loop
In [154]: %timeit a = np.array([[2,3,5],[3,6,2],[1,3,2]])
100000 loops, best of 3: 2.27 µs per loop
There can also be fixed costs incurred by NumPy function calls before the meat of the calculation can be performed by a fast underlying C/Fortran function, such as ensuring the inputs are NumPy arrays.
These setup/fixed costs are something to keep in mind before assuming NumPy solutions are inherently faster than pure-Python solutions. NumPy shines when you set up large arrays once and then perform many fast NumPy operations on the arrays. It may fail to outperform pure Python if the arrays are small because the setup cost can outweigh the benefit of offloading the calculations to compiled C/Fortran functions. For small arrays there simply may not be enough calculations to make it worth it.
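As a rough illustration of that fixed per-call cost (the array sizes here are my own choice, not from the answer), the per-element time of the same multiplication is far higher for a tiny array, since the call overhead is amortized over only three elements:

```python
from timeit import timeit
import numpy as np

small = np.arange(3)
large = np.arange(1000000)

# Average time per call, then per element touched.
t_small = timeit(lambda: small * 2, number=100000) / 100000
t_large = timeit(lambda: large * 2, number=100) / 100

per_elem_small = t_small / small.size
per_elem_large = t_large / large.size

# The fixed overhead of the NumPy call dominates for the 3-element array.
print(per_elem_small > per_elem_large)
```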
If you increase the size of the arrays a bit, and move creation of the arrays into the setup, then NumPy can be much faster than pure Python:
import numpy as np
from timeit import timeit
N, M = 300, 300
a = np.random.randint(100, size=(N,M))
b = np.random.randint(100, size=(N,))
a2 = a.tolist()
b2 = b.tolist()
s1="""
[[m*n for n in second] for m, second in zip(b2,a2)]
"""
s2 = """
(a.T*b).T
"""
s3 = """
a*b[:,None]
"""
assert np.allclose([[m*n for n in second] for m, second in zip(b2,a2)], (a.T*b).T)
assert np.allclose([[m*n for n in second] for m, second in zip(b2,a2)], a*b[:,None])
print('s1: {:.4f}'.format(
    timeit(stmt=s1, number=10**3, setup='from __main__ import a2,b2')))
print('s2: {:.4f}'.format(
    timeit(stmt=s2, number=10**3, setup='from __main__ import a,b')))
print('s3: {:.4f}'.format(
    timeit(stmt=s3, number=10**3, setup='from __main__ import a,b')))
yields
s1: 4.6990
s2: 0.1224
s3: 0.1234
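For what it's worth, s3 avoids the two transposes by adding a trailing axis to b so that broadcasting scales each row of a; a small check with the values from the question:

```python
import numpy as np

a = np.array([[2, 3, 5], [3, 6, 2], [1, 3, 2]])
b = np.array([4, 2, 1])

# b[:, None] has shape (3, 1); broadcasting stretches it across the
# columns of a, so row i of a is scaled by b[i] -- no transposes needed.
scaled = a * b[:, None]
print(scaled.tolist())  # [[8, 12, 20], [6, 12, 4], [1, 3, 2]]
assert (scaled == (a.T * b).T).all()
```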