
NumPy ndarray.all() vs np.all(ndarray) vs all(ndarray)

What is the difference between the three "all" methods in Python/NumPy? What is the reason for the performance difference? Is it true that ndarray.all() is always the fastest of the three?

Here is a timing test that I ran:

In [59]: a = np.full(100000, True, dtype=bool)

In [60]: timeit a.all()
The slowest run took 5.40 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.24 µs per loop

In [61]: timeit all(a)
1000 loops, best of 3: 1.34 ms per loop

In [62]: timeit np.all(a)
The slowest run took 5.54 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.41 µs per loop
dkv asked Apr 13 '17 01:04





2 Answers

The difference between np.all(a) and a.all() is simple:

  • If a is a numpy.ndarray, then np.all(a) will simply call a.all().
  • If a is not a numpy.ndarray, the np.all() call will convert it to one and then call all() on the result. Calling a.all() directly, on the other hand, will fail because a isn't an array and therefore probably has no all method.
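As a minimal sketch of the two points above (the names arr and lst are just illustrative), the function form accepts a plain list while the method form only exists on arrays:

```python
import numpy as np

arr = np.array([True, True, False])
lst = [True, True, False]

# For an ndarray, the function and the method give the same answer.
same_result = bool(np.all(arr)) == bool(arr.all())

# np.all accepts a plain list because it converts it to an array first.
list_result = bool(np.all(lst))

# A list has no .all() method, so the method form raises AttributeError.
try:
    lst.all()
    list_has_all = True
except AttributeError:
    list_has_all = False

print(same_result, list_result, list_has_all)
```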

The difference between np.all and all is more complicated.

  • The all function works on any iterable (including lists, sets, generators, ...). np.all works only on NumPy arrays and on anything that can be converted to one, e.g. lists and tuples.
  • np.all processes an array with a known, fixed data type, which makes the check for != 0 very efficient. all, however, needs to evaluate bool() on each item, which is much slower.
  • Processing arrays with Python functions is pretty slow because each item in the array needs to be converted to a Python object. np.all doesn't need to do that conversion.
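A small sketch of these points, using made-up inputs: the built-in all consumes arbitrary iterables, and when it iterates over an array it receives one NumPy scalar object per element, whereas the array itself is a single typed buffer.

```python
import numpy as np

a = np.full(5, True, dtype=bool)

# The built-in all() accepts any iterable, e.g. a generator or a set.
gen_result = all(x > 0 for x in [1, 2, 3])
set_result = all({1, 2, 0})

# Iterating an array hands Python one scalar object per element,
# and all() then has to evaluate the truth value of each of them.
first = next(iter(a))
is_numpy_scalar = isinstance(first, np.bool_)

# The array itself has a single fixed dtype that NumPy can scan directly.
buffer_dtype = a.dtype
np_result = bool(a.all())

print(gen_result, set_result, is_numpy_scalar, buffer_dtype, np_result)
```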

Note that the timings also depend on the type of a. If you process a Python list, all can be faster for relatively short lists. If you process an array, np.all and a.all() will be faster in almost all cases (except maybe for object arrays, but I won't go down that path; that way lies madness).
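If you want to check this on your own machine, here is a rough sketch using the standard timeit module. The helper name time_all_variants and the sizes are arbitrary choices for illustration, and the absolute numbers will vary with hardware and NumPy version.

```python
import timeit
import numpy as np

def time_all_variants(n, number=50):
    """Return seconds per call for each combination of container and 'all'."""
    lst = [True] * n
    arr = np.full(n, True, dtype=bool)
    candidates = {
        "all(list)": lambda: all(lst),
        "np.all(list)": lambda: np.all(lst),
        "all(array)": lambda: all(arr),
        "np.all(array)": lambda: np.all(arr),
        "array.all()": lambda: arr.all(),
    }
    return {name: timeit.timeit(fn, number=number) / number
            for name, fn in candidates.items()}

# Compare a short input with a long one.
for n in (10, 100_000):
    print(n, time_all_variants(n))
```

Comparing the "list" rows with the "array" rows for each size shows how much of the cost comes from converting the input versus scanning it.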

MSeifert answered Oct 13 '22 01:10



I'll take a swing at this:

  • np.all is a generic function that works with different data types. Under the hood it probably looks up ndarray.all, which is why it's slightly slower.

  • all is a Python built-in function; see https://docs.python.org/2/library/functions.html#all.

  • ndarray.all is a method of the 'numpy.ndarray' object; calling it directly may be faster.

pyCthon answered Oct 12 '22 23:10
