
Numpy dot product very slow using ints

Sorry for so many questions. I am running Mac OS X 10.6 on an Intel Core 2 Duo. I am running some benchmarks for my research and have run into another thing that baffles me.

If I run

python -mtimeit -s 'import numpy as np; a = np.random.randn(1e3,1e3)' 'np.dot(a,a)'

I get the following output: 10 loops, best of 3: 142 msec per loop

However, if I run

python -mtimeit -s 'import numpy as np; a = np.random.randint(10,size=1e6).reshape(1e3,1e3)' 'np.dot(a,a)'

I get the following output: 10 loops, best of 3: 7.57 sec per loop

Then I ran

python -mtimeit -s 'import numpy as np; a = np.random.randn(1e3,1e3)' 'a*a'

And then

python -mtimeit -s 'import numpy as np; a = np.random.randint(10,size=1e6).reshape(1e3,1e3)' 'a*a'

Both ran at about 7.6 msec per loop so it is not the multiplication. Adding had similar speeds as well, so neither of these should be affecting the dot-product, right? So why is it over 50 times slower to calculate the dot product using ints than using floats?

Nino asked Aug 08 '12 01:08


1 Answer

Very interesting. I was curious to see how it was implemented, so I did:

>>> import inspect
>>> import numpy as np
>>> inspect.getmodule(np.dot)
<module 'numpy.core._dotblas' from '/Library/Python/2.6/site-packages/numpy-1.6.1-py2.6-macosx-10.6-universal.egg/numpy/core/_dotblas.so'>

So it looks like it's using the BLAS library.


>>> help(np.core._dotblas)

from which I found this:

When Numpy is built with an accelerated BLAS like ATLAS, these functions are replaced to make use of the faster implementations. The faster implementations only affect float32, float64, complex64, and complex128 arrays. Furthermore, the BLAS API only includes matrix-matrix, matrix-vector, and vector-vector products. Products of arrays with larger dimensionalities use the built in functions and are not accelerated.

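So it looks like ATLAS fine-tunes certain functions, but that only applies to certain data types. Going by the text above, integer arrays are not on that list, so they presumably fall back to the unaccelerated built-in functions. Very interesting.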
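If you want to check which BLAS (if any) your own NumPy build is linked against, here is a minimal sketch. It assumes only that `np.show_config()` is available, which prints the build configuration; the exact output format differs between NumPy versions, so the snippet just captures and prints it without parsing anything.

```python
import contextlib
import io

import numpy as np

# Capture the build report that np.show_config() prints to stdout.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    np.show_config()
config_text = buffer.getvalue()

# Look through this text for BLAS/LAPACK entries (ATLAS, OpenBLAS, MKL,
# Accelerate, ...). The layout varies by NumPy version, so it is printed as-is.
print(config_text)
```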

So yeah, it looks like I'll be using floats more often...
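For what it's worth, here is a rough sketch of what "using floats" could look like when the data really is integer: cast to float64, take the dot product, and round back. The helper name `dot_via_float` is made up for this example, and the trick is only exact while every entry of the product stays below 2**53 (the largest range where float64 represents all integers exactly). The matrix size is kept small so it runs quickly; timings will depend on your machine and BLAS.

```python
import timeit

import numpy as np

n = 200  # small size so the example runs quickly; the question used 1000
rng = np.random.RandomState(0)
a_int = rng.randint(10, size=(n, n))  # integer matrix with entries 0..9


def dot_via_float(a):
    """Hypothetical helper: multiply an integer matrix by itself in float64.

    Exact only while every entry of the product is below 2**53.
    """
    a_f = a.astype(np.float64)
    product = np.dot(a_f, a_f)
    # Round to the nearest integer and restore the original dtype.
    return np.rint(product).astype(a.dtype)


int_result = np.dot(a_int, a_int)      # integer dot product
float_result = dot_via_float(a_int)    # float64 dot product, cast back
results_match = np.array_equal(int_result, float_result)

t_int = timeit.timeit(lambda: np.dot(a_int, a_int), number=3)
t_float = timeit.timeit(lambda: dot_via_float(a_int), number=3)
print("results match:", results_match)
print("int dot:   %.4f s for 3 runs" % t_int)
print("float dot: %.4f s for 3 runs" % t_float)
```

Here the largest possible entry is 9 * 9 * 200 = 16200, so the round trip through float64 loses nothing. With larger values or larger matrices, check the bound before trusting the result.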

Samy Vilar answered Sep 17 '22 15:09