I was surprised that calling np.inner to compute a sum of squares was about 5x faster than calling np.sum on a pre-computed array of squares:
Any insights into this behavior? I'm actually interested in a very fast implementation of a sum of squares, so those thoughts are welcome, too.
Python's sum iterates over the iterable (in this case the list or array) and adds all elements. NumPy's sum method iterates over the stored C array, adds these C values, and finally wraps the result in a Python object (in this case numpy.int32 or numpy.int64) and returns it.
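A minimal sketch of that difference, assuming a small integer array (the variable names are made up for illustration). The exact NumPy integer type of the result depends on platform and NumPy version, so the code only checks that it is some NumPy integer:

```python
import numpy as np

values = [1, 2, 3, 4]
arr = np.array(values)

# Built-in sum: iterates element by element in the interpreter.
py_total = sum(values)

# np.sum: the loop runs in C; the result is wrapped in a NumPy scalar.
np_total = np.sum(arr)

print(type(py_total))   # built-in int
print(type(np_total))   # a NumPy integer type such as numpy.int64
```

Both calls give the same value here; only the return type and where the loop runs differ.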
Essentially, the NumPy sum function sums up the elements of an array. It just takes the elements within a NumPy array (an ndarray object) and adds them together. Having said that, it can get a little more complicated. It's possible to also add up the rows or add up the columns of an array.
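As a small illustration of summing along rows or columns, here is a sketch using the axis argument of np.sum on a hypothetical 2x3 array:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)             # all elements: 21
col_sums = np.sum(m, axis=0)  # collapse the rows, one sum per column: [5 7 9]
row_sums = np.sum(m, axis=1)  # collapse the columns, one sum per row: [6 15]

print(total, col_sums, row_sums)
```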
When np.sum receives an array of booleans as its argument, it sums the elements (counting True as 1 and False as 0) and returns the result. For instance, np.sum([True, True, False]) returns 2.
To check which modules np.inner and np.sum are implemented in, I type:
>>> np.inner.__module__
'numpy.core.multiarray'
>>> np.sum.__module__
'numpy.core.fromnumeric'
>>> np.__file__
'/Users/uweschmitt/venv_so/lib/python3.5/site-packages/numpy/__init__.py'
If you inspect the actual files, you can see that numpy.core.multiarray is a pure C module, whereas numpy.core.fromnumeric first does some checks and conversions in Python, then calls a second Python function, and only then calls the pure C implementation that does the actual summation.
I suspect that this overhead from the Python interpreter explains the observed timing differences.
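One way to look at this overhead directly is to time both calls on a very small array, where the fixed per-call cost dominates the actual arithmetic. This is only a sketch: the array size and call count are arbitrary choices, and the absolute numbers and the ratio will vary with the NumPy version and machine.

```python
import timeit
import numpy as np

small = np.random.random(10)  # tiny array: per-call overhead dominates
n_calls = 10_000

# Both lambdas add the same small overhead, so the comparison stays fair.
t_inner = timeit.timeit(lambda: np.inner(small, small), number=n_calls)
t_sum = timeit.timeit(lambda: np.sum(small), number=n_calls)

print(f"np.inner: {t_inner / n_calls * 1e6:.2f} µs per call")
print(f"np.sum:   {t_sum / n_calls * 1e6:.2f} µs per call")
```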
To test my assumption, I run the timing with a larger array and get:
In [8]: a = np.random.random(1000000)
In [9]: %timeit np.inner(a, a)
1000 loops, best of 3: 673 µs per loop
In [10]: %timeit np.sum(a)
1000 loops, best of 3: 584 µs per loop
Now the run times are quite similar and change a little if you repeat the statements; sometimes np.sum wins, sometimes np.inner.
For the big array, the actual work of np.sum is done in C, and the constant-time overhead from the Python interpreter is negligible.
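For the original goal of a fast sum of squares, here are a few equivalent ways to write it, as a sketch to benchmark rather than a recommendation. Which one is fastest is an open question that depends on array size, dtype, and how NumPy was built, so it is worth timing them on your own data:

```python
import numpy as np

a = np.random.random(1_000_000)

s_inner = np.inner(a, a)             # inner product of a with itself
s_dot = np.dot(a, a)                 # same result for 1-D arrays
s_einsum = np.einsum("i,i->", a, a)  # explicit contraction over index i
s_sum = np.sum(a * a)                # allocates a temporary array for a * a

print(s_inner, s_dot, s_einsum, s_sum)
```

The first three avoid the temporary array that a * a creates, which can matter for large inputs.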