I'm learning NumPy, but there is something that confuses me:
>>> import numpy as np
>>> a = np.arange(10)
>>> a.sum()
45
and sum(a) gives the same result.
So why can a built-in function handle a data type from a third-party library? min() and max() behave the same way (when the array is one-dimensional).
I have two guesses about this, and I prefer the latter:
All a third-party library type has to do is implement the expected protocol (sometimes also called an interface). The sum() function documentation tells you what is expected:
Sums start and the items of an iterable from left to right and returns the total.
min() and max() state similar requirements (Return the smallest item in an iterable, Return the largest item in an iterable).
Here, iterable is a protocol, described in the standard types documentation. Protocols are not themselves types; they are just a collection of methods that are expected to behave in a certain way. The collections.abc module provides several objects you can use to test whether something implements a protocol:
>>> import numpy as np
>>> from collections.abc import Iterable
>>> a = np.arange(10)
>>> isinstance(a, Iterable)
True
So the ndarray type is an iterable, and that's what the sum() function uses to get all the values contained in the array, summing those values for you.
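To make the protocol idea concrete, here is a minimal sketch of a made-up container. The Countdown class is hypothetical (it is not part of Python or NumPy); the only thing it does is define __iter__, and that alone should be enough for sum(), min() and max() to accept it:

```python
from collections.abc import Iterable


class Countdown:
    """Hypothetical container used only for illustration."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Yield start, start - 1, ..., 1. Defining this method is what
        # makes instances satisfy the iterable protocol.
        return iter(range(self.start, 0, -1))


c = Countdown(4)

print(isinstance(c, Iterable))  # True
print(sum(c), min(c), max(c))   # 10 1 4
```

The built-ins never check for a specific class; they only ask the object for an iterator and consume whatever it yields.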
Because Python relies on protocols, the core language developers don't have to add support for every third-party library out there. Instead, the libraries simply match the expectations of the core language.
Note that the ndarray.sum() implementation can make use of the internal implementation of the type; it can probably produce the sum faster, as it doesn't have to box every element into a separate object first (iteration returns one boxed object per element, NumPy scalar objects such as numpy.int64 in this case, while the internal representation contains bare C integers).
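As a rough illustration of the difference between the two approaches, the sketch below looks at what iteration hands to the built-in sum(), and at what happens once the array has more than one dimension (the case the question sets aside). The variable names are made up for this example, and the exact scalar type depends on the platform:

```python
import numpy as np

a = np.arange(10)

# Iterating a 1-D array yields one boxed object per element.
first = next(iter(a))
print(type(first))  # a NumPy integer scalar type; the exact name varies by platform

# Both approaches agree on the total for a 1-D array.
builtin_total = sum(a)
method_total = a.sum()
print(builtin_total == method_total)  # True

# Iterating a 2-D array yields its rows, so the built-in sum() adds the
# rows together element-wise, while ndarray.sum() adds up every element.
b = np.arange(6).reshape(2, 3)
row_total = sum(b)
print(row_total)  # [3 5 7]
print(b.sum())    # 15
```

This is why the built-in functions only match the ndarray methods in the one-dimensional case: they know nothing about array shape, only about iteration.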