I just spent half an hour looking into a bug in statsmodels' SARIMAX functionality that I could finally trace back to the fact that numpy.int32 fails type checks for int.
>>> import numpy as np
>>> foo = np.int32(3)
>>> isinstance(foo, int)
False
Is there a way to circumvent this kind of issue without explicit type conversions? Should proper code even test for types, rather than checking whether a variable can safely be cast to a type?
Edit: An answer to my question would explain which technical limitations or design decisions cause this behaviour, and how to handle, in a Pythonic way, cases where both pure Python's int and NumPy's int32 or int64 types might appear.
Why should numpy.int32 descend from int? int is a specific class. It is one way of representing integers. That doesn't mean that every class that represents integers should descend from int. numpy.int32 has different semantics and different methods - for example, it has most of the functionality needed to operate like a 0-dimensional array - and inheriting from int isn't particularly useful for implementing numpy.int32.
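To make the point about different semantics concrete, here is a small sketch (not part of the original answer) that inspects a few array-like attributes a numpy.int32 carries and a plain int lacks. The variable names are purely illustrative.

```python
import numpy as np

value = np.int32(3)

# NumPy scalars expose the same basic attributes as a 0-dimensional array.
scalar_shape = value.shape   # an empty tuple
scalar_ndim = value.ndim     # zero dimensions
scalar_dtype = value.dtype   # the int32 dtype

# A plain Python int has none of these attributes.
int_has_shape = hasattr(3, "shape")

print(scalar_shape, scalar_ndim, scalar_dtype, int_has_shape)
```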
On some builds of Python 2 (Windows only?), numpy.int32 actually will descend from int (which is also 32-bit on those builds), but I believe this design decision dates back to a time when int performed wraparound arithmetic like numpy.int32 instead of promoting to long on overflow, and when operator.index didn't exist. It was a more reasonable decision back then.
As for how to treat numpy.int32 like int, numbers.Integral does a sort of okay job, but the implementation relies on people explicitly registering their classes with numbers.Integral, and people often don't think to do that. NumPy didn't add the register calls until 2014, 6 years after numbers.Integral was introduced. Similar libraries like SymPy still don't have the calls.
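As a rough illustration of the numbers.Integral approach, the sketch below assumes a NumPy version recent enough to include the register calls mentioned above. Note that bool also passes this check, since bool is a subclass of int.

```python
import numbers

import numpy as np

# NumPy integer scalars are registered as virtual subclasses of Integral.
np_is_integral = isinstance(np.int32(3), numbers.Integral)

# Built-in int is Integral as well; floats are not.
py_is_integral = isinstance(3, numbers.Integral)
float_is_integral = isinstance(3.0, numbers.Integral)

# Caveat: bool is a subclass of int, so it also counts as Integral.
bool_is_integral = isinstance(True, numbers.Integral)

print(np_is_integral, py_is_integral, float_is_integral, bool_is_integral)
```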
I find operator.index to be a better check:
import operator

try:
    real_int = operator.index(some_intlike_thing)
except TypeError:
    # Not intlike.
    do_something_about_that()
operator.index is the hook an int-like class has to implement to make its instances usable as a sequence index. It's a stricter check than int(x), which would accept 3.5 and '3'. Since there's a concrete, easily noticeable impact if this hook is missing, it's more likely to be present than numbers.Integral support.
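The difference from int(x) can be seen in a short sketch. The helper name as_index is hypothetical and chosen only for this example; it wraps the try/except pattern shown earlier and returns None for anything that is not int-like.

```python
import operator

import numpy as np


def as_index(value):
    """Return value as a plain Python int, or None if it is not int-like."""
    try:
        return operator.index(value)
    except TypeError:
        return None


from_numpy = as_index(np.int32(3))   # accepted, converted to a Python int
from_float = as_index(3.5)           # rejected, although int(3.5) would succeed
from_string = as_index("3")          # rejected, although int("3") would succeed

print(from_numpy, type(from_numpy), from_float, from_string)
```

Because operator.index returns a genuine Python int, the result can be passed on to code that performs strict isinstance(x, int) checks.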
__mro__ lists the inheritance stack of a class:
np.int32.__mro__
Out[30]:
(numpy.int32,
numpy.signedinteger,
numpy.integer,
numpy.number,
numpy.generic,
object)
int.__mro__
Out[31]: (int, object)
For a basic array:
x=np.array([1,2,3])
x.dtype
Out[33]: dtype('int32')
isinstance with any class on this stack returns True:
isinstance(x[0], np.int32)
Out[37]: True
isinstance(x[0], np.number)
Out[38]: True
int isn't on this stack:
isinstance(x[0], int)
Out[39]: False
isinstance(x[0], object)
Out[40]: True
item extracts a value from its numpy wrapper:
isinstance(x[0].item(), int)
Out[41]: True
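For whole arrays, tolist performs a similar unwrapping for every element. The following is a minimal sketch with illustrative variable names, not something taken from the session above:

```python
import numpy as np

x = np.array([1, 2, 3])

# item() unwraps a single NumPy scalar into the matching Python type.
first = x[0].item()

# tolist() does the same for every element of the array.
as_list = x.tolist()

print(type(first), [type(v) for v in as_list])
```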
@kazemakase suggests using the numbers module:
isinstance(x[0], numbers.Integral)
Out[47]: True
isinstance accepts a tuple of classes, so we can handle both the int and numpy cases with one test:
In [259]: isinstance(x[0], (int,np.integer))
Out[259]: True
In [260]: isinstance(x[0].item(), (int,np.integer))
Out[260]: True
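The tuple test could be wrapped in a small helper if it is needed in several places. The name is_integer is hypothetical; this is only a sketch of one way to package the check, and it shares the caveat that Python's bool passes because it subclasses int.

```python
import numpy as np


def is_integer(value):
    """True for Python ints and for any NumPy integer scalar."""
    return isinstance(value, (int, np.integer))


checks = {
    "python int": is_integer(3),
    "numpy int64": is_integer(np.int64(3)),
    "python float": is_integer(3.0),
    "numpy float32": is_integer(np.float32(3)),
}
print(checks)
```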