
Why can Python built-in functions such as sum(), max(), and min() be used on NumPy's ndarray data type?

Tags: python, numpy

I'm learning numpy, and something confuses me:

>>> import numpy as np
>>> a = np.arange(10)
>>> a.sum()
45

and sum(a) gives the same result. So why can a built-in function support calculations on a data type from a third-party library? min() and max() behave the same way (when the array is one-dimensional).

I have two guesses about this, and I prefer the latter:

  1. Python core developers added support for ndarray;
  2. some hidden attributes defined on ndarray make this happen. (If so, what are they?)

Feishi asked Nov 29 '16 09:11


1 Answer

All a third-party library type has to do is implement the expected protocol (sometimes also called an interface). The sum() function documentation tells you what is expected:

Sums start and the items of an iterable from left to right and returns the total.

min() and max() state similar requirements (Return the smallest item in an iterable, Return the largest item in an iterable).

Here, iterable is a protocol, described in the standard types documentation. Protocols are not themselves types; they are just a collection of methods that are expected to behave in a certain way. The collections.abc module provides several abstract base classes you can use to test whether something implements a protocol:

>>> import numpy as np
>>> from collections.abc import Iterable
>>> a = np.arange(10)
>>> isinstance(a, Iterable)
True

So the ndarray type is an iterable, and that's what the sum() function uses to get all the values contained in the array, summing those values for you.
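
As a rough sketch of what that means, the loop below mimics the documented behavior of sum() in pure Python. The name my_sum is hypothetical, and the real built-in is written in C with its own optimizations, but the only things it needs from its argument are the same: iteration and the + operator.

```python
def my_sum(iterable, start=0):
    """Hypothetical pure-Python stand-in for the built-in sum()."""
    total = start
    # The for loop calls iter(iterable) and then next() repeatedly;
    # nothing here knows or cares what concrete type `iterable` is.
    for item in iterable:
        total = total + item
    return total


print(my_sum(range(10)))        # same total as sum(range(10))
print(my_sum([1.5, 2.5], 10))   # the start value is the initial total
```

Passing a one-dimensional ndarray to my_sum would work for the same reason it works with sum(): the array can be iterated and its elements can be added.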

Because Python relies on protocols, the core language developers don't have to add support for every third-party library out there. Instead, the libraries simply match the expectations of the core language.
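
To illustrate, here is a minimal sketch of a made-up type (Countdown is hypothetical, not part of any library) that gets sum(), min() and max() support purely by defining __iter__:

```python
from collections.abc import Iterable


class Countdown:
    """Hypothetical third-party type that knows nothing about sum()."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Implementing __iter__ is enough to satisfy the iterable protocol.
        n = self.start
        while n > 0:
            yield n
            n -= 1


c = Countdown(4)                  # iterates over 4, 3, 2, 1
print(isinstance(c, Iterable))    # the same check used on the ndarray
print(sum(c), min(c), max(c))
```

No change to Python itself was needed; the built-ins only ask the object for its items.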

Note that the ndarray.sum() implementation can work directly on the array's internal data, so it can usually produce the sum faster: it doesn't have to convert the internal data to objects first. Iteration yields one boxed object per element (NumPy scalar objects such as numpy.int64 in this case), while the internal representation contains bare C integers.
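
A small experiment along these lines, assuming NumPy is installed, might look like the following. It prints the type that iteration actually hands to sum() (a boxed NumPy scalar, whose exact integer width depends on the platform), checks that both approaches agree, and times them. The array size and repeat count are arbitrary choices, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

a = np.arange(10_000)

# Iterating boxes each element into an object before sum() can add it.
first = next(iter(a))
print(type(first))

# Both approaches agree on the total.
builtin_total = sum(a)
method_total = a.sum()
print(builtin_total == method_total)

# Only the relative difference between the two timings is of interest.
print("built-in sum():", timeit.timeit(lambda: sum(a), number=100))
print("ndarray.sum(): ", timeit.timeit(lambda: a.sum(), number=100))
```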

Martijn Pieters answered Oct 23 '22 10:10