
Why does numpy.sum return a float64 instead of a uint64 when adding the elements of a generator?

Tags: python, numpy

I just came across this strange behaviour of numpy.sum:

>>> import numpy
>>> ar = numpy.array([1,2,3], dtype=numpy.uint64)
>>> gen = (el for el in ar)
>>> lst = [el for el in ar]
>>> numpy.sum(gen)
6.0
>>> numpy.sum(lst)
6
>>> numpy.sum(iter(lst))
<listiterator object at 0x87d02cc>

According to the documentation, the result should have the same dtype as the iterable, so why is a numpy.float64 returned in the first case instead of a numpy.uint64? And why does the last example neither return any kind of sum nor raise an error?

Bakuriu asked Dec 21 '22 12:12


1 Answer

In general, numpy functions don't always do what you might expect when working with generators. To create a numpy array, you need to know its size and type before creating it, and this isn't possible for a generator. So many numpy functions either don't work with generators at all or, as numpy.sum does here, fall back on Python builtins.
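The fallback can be reproduced by hand. The sketch below assumes that the integer start value of Python's builtin sum is treated as a signed 64-bit integer, which matches the behaviour shown in the question. The explicit numpy.int64(0) start value is an illustrative stand-in for that, and newer NumPy releases may promote a plain Python int differently.

```python
import numpy

ar = numpy.array([1, 2, 3], dtype=numpy.uint64)

# No integer dtype covers both the int64 and uint64 ranges,
# so NumPy promotes the pair to float64.
promoted = numpy.promote_types(numpy.int64, numpy.uint64)

# Python's builtin sum adds the elements one by one to a start value.
# Assumption: numpy.int64(0) stands in for the default start value 0.
total = sum((el for el in ar), numpy.int64(0))

print(promoted, type(total).__name__, total)
```

Once the first addition has produced a float64, every later addition stays float64, which is consistent with the 6.0 seen in the question.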

However, for the same reason, generators often aren't that useful in numpy contexts. There's no real advantage to making a generator from a numpy object, because the entire object already has to be in memory anyway. If you need the dtypes to stay exactly as you specified them, simply don't wrap your numpy objects in generators.
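A short sketch of the direct approach, reusing the array from the question: summing the array itself, or a list of its elements (which numpy can convert to an array), keeps the uint64 dtype.

```python
import numpy

ar = numpy.array([1, 2, 3], dtype=numpy.uint64)

# Summing the array directly never leaves NumPy, so the dtype is preserved.
direct = numpy.sum(ar)

# A list is a sequence, so NumPy converts it to a uint64 array before summing.
from_list = numpy.sum([el for el in ar])

print(direct.dtype, from_list.dtype)
```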

Some more info: Technically, the argument to np.sum is supposed to be an "array-like" object, not an iterable. Array-like is defined in the documentation as:

An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.

The array interface is documented here. Basically, arrays have to have a fixed shape and a uniform type.
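One way to see why an iterator does not qualify, offered as an illustration rather than a description of numpy.sum's internals: numpy.asarray cannot infer a shape or element type from an iterator, so it wraps the iterator object itself in a zero-dimensional object array. This may be why the last example in the question hands the iterator back unchanged.

```python
import numpy

lst = [1, 2, 3]

# A list is a sequence: NumPy can read its length and element types.
as_array = numpy.asarray(lst)

# An iterator exposes no length, so NumPy stores the iterator object itself
# as the single element of a 0-d array with dtype=object.
it = iter(lst)
wrapped = numpy.asarray(it)

print(as_array.shape, wrapped.shape, wrapped.dtype)
```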

Generators don't fit this protocol and so aren't really supported. Many numpy functions are nice and will accept other sorts of objects that don't technically qualify as array-like, but a strict reading of the docs implies you can't rely on this behavior. The operations may work, but you can't expect all the types to be preserved perfectly.
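If the data really does arrive as a generator, one option is to materialize it with an explicit dtype before summing. This is a sketch using numpy.fromiter, which the answer above does not mention; the generator is simply the one from the question.

```python
import numpy

ar = numpy.array([1, 2, 3], dtype=numpy.uint64)
gen = (el for el in ar)

# fromiter consumes the generator and builds a 1-D array of the requested dtype.
materialized = numpy.fromiter(gen, dtype=numpy.uint64)

# The sum now runs on a real array, so the uint64 dtype is kept.
total = materialized.sum()
print(total.dtype, total)
```

Because the dtype is stated up front, the result no longer depends on how scalars happen to be promoted along the way.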

BrenBarn answered May 19 '23 12:05