Sorry for so many questions. I am running Mac OS X 10.6 on an Intel Core 2 Duo. I am running some benchmarks for my research, and I have run into another thing that baffles me.
If I run
python -mtimeit -s 'import numpy as np; a = np.random.randn(1e3,1e3)' 'np.dot(a,a)'
I get the following output: 10 loops, best of 3: 142 msec per loop
However, if I run
python -mtimeit -s 'import numpy as np; a = np.random.randint(10,size=1e6).reshape(1e3,1e3)' 'np.dot(a,a)'
I get the following output: 10 loops, best of 3: 7.57 sec per loop
Then I ran
python -mtimeit -s 'import numpy as np; a = np.random.randn(1e3,1e3)' 'a*a'
And then
python -mtimeit -s 'import numpy as np; a = np.random.randint(10,size=1e6).reshape(1e3,1e3)' 'a*a'
Both ran at about 7.6 msec per loop, so it is not the elementwise multiplication. Addition had similar speeds as well, so neither of these should be affecting the dot product, right? So why is it over 50 times slower to calculate the dot product using ints than using floats?
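For reference, here is roughly the same comparison as a single Python script using the standard timeit module (just a sketch; the sizes and repeat counts are illustrative, not the exact commands above):

import timeit

setup_float = "import numpy as np; a = np.random.randn(1000, 1000)"
setup_int = "import numpy as np; a = np.random.randint(10, size=(1000, 1000))"

# Matrix product, once per dtype (number=1 so the int case does not take minutes).
print("float dot: %.3f s" % min(timeit.repeat("np.dot(a, a)", setup_float, repeat=3, number=1)))
print("int   dot: %.3f s" % min(timeit.repeat("np.dot(a, a)", setup_int, repeat=3, number=1)))

# Elementwise product, for comparison.
print("float mul: %.3f s" % min(timeit.repeat("a * a", setup_float, repeat=3, number=1)))
print("int   mul: %.3f s" % min(timeit.repeat("a * a", setup_int, repeat=3, number=1)))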
Very interesting. I was curious to see how it was implemented, so I did:
>>> import inspect
>>> import numpy as np
>>> inspect.getmodule(np.dot)
<module 'numpy.core._dotblas' from '/Library/Python/2.6/site-packages/numpy-1.6.1-py2.6-macosx-10.6-universal.egg/numpy/core/_dotblas.so'>
>>>
So it looks like it's using the BLAS library, so:
>>> help(np.core._dotblas)
From which I found this:
When Numpy is built with an accelerated BLAS like ATLAS, these functions are replaced to make use of the faster implementations. The faster implementations only affect float32, float64, complex64, and complex128 arrays. Furthermore, the BLAS API only includes matrix-matrix, matrix-vector, and vector-vector products. Products of arrays with larger dimensionalities use the built in functions and are not accelerated.
So it looks like ATLAS fine-tunes certain functions, but it's only applicable to certain data types. Very interesting.
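This also suggests a simple workaround when the data starts out as integers: cast to float64 before the dot product so the BLAS path is used. A minimal sketch (assuming the integer values are small enough that the products are represented exactly in float64, i.e. below 2**53):

import numpy as np

a = np.random.randint(10, size=(1000, 1000))

# Integer dtype: falls back to NumPy's generic (non-BLAS) dot implementation.
slow = np.dot(a, a)

# Cast to float64 first so the accelerated BLAS routine is used.
af = a.astype(np.float64)
fast = np.dot(af, af)

# For small integers like these the two results agree exactly.
assert np.array_equal(slow, fast.astype(a.dtype))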
So yeah, it looks like I'll be using floats more often ...
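As a side note, if you want to check which BLAS your NumPy build is actually linked against (ATLAS, Accelerate, OpenBLAS, ...), this should work on most NumPy versions, though the exact output format varies:

import numpy as np

# Prints the BLAS/LAPACK build information NumPy was compiled with.
np.show_config()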