itertools does not recognize numpy ints as valid inputs on Python 3.6

Take this code:

import itertools as it
import numpy as np
data = ['a','b','c','d']
dw = np.array([1, 3], dtype=np.int64)
print(list(it.islice(data,dw[0],dw[1],1)))

On Python 2.7 it prints ['b', 'c'] as expected.

On Python 3.6 it throws an exception:

ValueError: Stop argument for islice() must be None or an integer: 0 <= x <= sys.maxsize.

The same goes for np.int32, and other functions in the itertools package throw similar errors; for example, permutations raises TypeError: Expected int as r.

I couldn't find much on this apart from this numpy issue and related ones, but that one was closed 3 years ago implying it was solved.

And basic things like indexing with numpy ints data[dw[0]] or boolean comparisons like dw[0] == 1 work just fine.

Am I missing something? Could this be a Python 3 bug?

Khris asked Jun 01 '17 08:06


2 Answers

A numpy.int64 is apparently not a subclass of int:

a, b = dw[0], dw[1]

type(a)

numpy.int64

isinstance(a, int)

False

Numpy documentation

The documentation mentions this explicitly:

Warning

The int_ type does not inherit from the int built-in under Python 3, because type int is no longer a fixed-width integer type.

Solution

print(list(it.islice(data, int(dw[0]) , int(dw[1]), 1)))

or ordinary slicing, which accepts numpy ints (data is a plain list here):

data[dw[0]:dw[1]:1]
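
A slightly more general variant of the int() workaround is to coerce the bounds with operator.index, which accepts any object that implements __index__ and rejects floats instead of truncating them. This is only a sketch: the helper name islice_index and the class FakeInt64 are hypothetical. FakeInt64 merely mimics the relevant property of numpy.int64 (it implements __index__ without subclassing int), so the example runs without numpy installed.

```python
import itertools as it
import operator


def islice_index(iterable, start, stop, step=1):
    """Call itertools.islice after coercing the bounds to built-in ints.

    Hypothetical helper: operator.index() accepts anything that implements
    __index__ and raises TypeError for floats, unlike int(), which would
    silently truncate them.
    """
    def coerce(value):
        # None keeps its usual islice meaning ("no bound").
        return None if value is None else operator.index(value)

    return it.islice(iterable, coerce(start), coerce(stop), coerce(step))


class FakeInt64:
    """Stand-in for numpy.int64 (assumption): integer-like, but not an int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


data = ['a', 'b', 'c', 'd']
result = list(islice_index(data, FakeInt64(1), FakeInt64(3)))
print(result)
```

With real numpy values the call would look like islice_index(data, dw[0], dw[1]).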

Maarten Fabré answered Sep 28 '22 07:09


I'm not sure whether it's a bug in Python 3 or not, but it looks like the behaviour has changed since 2.7. As the numpy issue you linked describes, under py27 either numpy.int32 or numpy.int64 would appear to be a subclass of int (depending on whether you use a 32- or 64-bit build of Python); under py3 the types are no longer related, because numpy has fixed-width numeric types whereas Python's int is variable-width.

The implementation of itertools.islice requires its arguments to be objects of type PyLong (which is the Python API name for the Python int type). Specifically, it calls PyLong_AsSize_t, which converts a Python object into a C size_t value. This method seems to require that its argument is actually a Python int object, since it calls PyLong_Check. I think this method is broadly equivalent to Python's isinstance(obj, int), which explains the difference in behaviour between py2 and py3 here.

Normal list indexing uses another, more tolerant function to coerce its arguments into integer values, called PyNumber_AsSsize_t. This checks whether its argument is an int and, if not, falls back to calling the argument's __index__ method; as @MarkDickinson points out, numpy's numeric types implement this method, so everything works fine. Perhaps this would be a more intuitive thing for itertools.islice to do.
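
The difference between the two conversions can be mimicked in pure Python. This is only a rough sketch of the behaviour described above, not the C implementation: strict_as_index and tolerant_as_index are hypothetical names, and IntegerLike is a stand-in for a numpy integer scalar so that the example runs without numpy.

```python
import operator


class IntegerLike:
    """Stand-in for a numpy integer (assumption): has __index__, is not an int."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


def strict_as_index(obj):
    # Roughly the check described for islice on Python 3.6: only real ints pass.
    if not isinstance(obj, int):
        raise TypeError("an integer is required")
    return obj


def tolerant_as_index(obj):
    # Roughly what list indexing does: accept ints, otherwise try __index__.
    return operator.index(obj)


value = IntegerLike(2)
print(tolerant_as_index(value))

try:
    strict_as_index(value)
except TypeError as exc:
    print("strict check failed:", exc)

# List indexing goes through the tolerant path, so this works.
print(['a', 'b', 'c', 'd'][value])
```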

wildwilhelm answered Sep 28 '22 08:09