Why is numpy.int32 not recognized as an int type

I just spent half an hour looking into a bug in statsmodels' SARIMAX functionality that I could finally trace back to the fact that numpy.int32 fails type checks for int.

>>> import numpy as np
>>> foo = np.int32(3)
>>> isinstance(foo, int)
False

Is there a way to circumvent this kind of issue without explicit type conversions? Should well-written code test for types at all, rather than checking whether a value can safely be cast to the required type?

Edit: An answer to my question would explain which technical limitations or design decisions cause this behaviour, and how to handle, pythonically, cases where both pure Python's int and numpy's int32 or int64 types might appear.

Neuneck asked Jan 26 '18 09:01


2 Answers

Why should numpy.int32 descend from int? int is a specific class. It is one way of representing integers. That doesn't mean that every class that represents integers should descend from int. numpy.int32 has different semantics and different methods - for example, it has most of the functionality needed to operate like a 0-dimensional array - and inheriting from int isn't particularly useful for implementing numpy.int32.

On some builds of Python 2 (Windows only?), numpy.int32 actually will descend from int (which is also 32-bit on those builds), but I believe this design decision dates back to a time when int performed wraparound arithmetic like numpy.int32 instead of promoting to long on overflow, and when operator.index didn't exist. It was a more reasonable decision back then.

As for how to treat numpy.int32 like int, numbers.Integral does a passable job, but it only works for classes that have been explicitly registered with numbers.Integral, and people often don't think to do that. NumPy didn't add the register calls until 2014, 6 years after numbers.Integral was introduced. Similar libraries like SymPy still don't have the calls.
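The registration mechanism described above can be sketched with the standard library alone. Int32Like is a hypothetical stand-in for an integer class that does not inherit from int; it is not NumPy's actual implementation.

```python
import numbers


class Int32Like:
    """Hypothetical int-like class that does not inherit from int."""

    def __init__(self, value):
        self._value = int(value)

    def __index__(self):
        return self._value


value = Int32Like(3)

# Without registration, the ABC knows nothing about the class.
before = isinstance(value, numbers.Integral)

# register() declares Int32Like a "virtual subclass" of Integral.
# It does not verify that the Integral methods are actually implemented.
numbers.Integral.register(Int32Like)
after = isinstance(value, numbers.Integral)

print(before, after)  # False True
```

Because register() is purely a declaration, the check is only as reliable as the library author's diligence in making the call.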

I find operator.index to be a better check:

import operator

try:
    real_int = operator.index(some_intlike_thing)
except TypeError:
    # Not intlike.
    do_something_about_that()

operator.index is the hook an int-like class has to implement to make its instances usable as a sequence index. It's a stricter check than int(x), which would accept 3.5 and '3'. Since there's a concrete, easily noticeable impact if this hook is missing, it's more likely to be present than numbers.Integral support.
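A minimal sketch of that difference, using only the standard library. Int32Like and to_int_or_none are hypothetical names introduced for illustration; Int32Like implements nothing but the __index__ hook.

```python
import operator


class Int32Like:
    """Hypothetical int-like class; implements only the __index__ hook."""

    def __init__(self, value):
        self._value = int(value)

    def __index__(self):
        return self._value


def to_int_or_none(value):
    """Return value as a plain int if it is int-like, else None."""
    try:
        return operator.index(value)
    except TypeError:
        return None


print(to_int_or_none(7))             # 7
print(to_int_or_none(Int32Like(3)))  # 3
print(to_int_or_none(3.5))           # None
print(to_int_or_none("3"))           # None

# int() is more permissive: it truncates floats and parses strings.
print(int(3.5), int("3"))            # 3 3

# The same hook is what makes the object usable as a sequence index.
print(["a", "b", "c", "d"][Int32Like(3)])  # d
```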

user2357112 supports Monica answered Sep 26 '22 01:09


__mro__ lists the inheritance stack of a class:

np.int32.__mro__
Out[30]: 
(numpy.int32,
 numpy.signedinteger,
 numpy.integer,
 numpy.number,
 numpy.generic,
 object)

int.__mro__
Out[31]: (int, object)

For a basic array (the default integer dtype is platform-dependent; here it is int32):

x = np.array([1, 2, 3])
x.dtype
Out[33]: dtype('int32')

isinstance returns True for any class on this stack:

isinstance(x[0], np.int32)
Out[37]: True    
isinstance(x[0], np.number)
Out[38]: True    

int isn't on this stack:

isinstance(x[0], int)
Out[39]: False    
isinstance(x[0], object)
Out[40]: True
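The relationship between __mro__ and isinstance can be reproduced without NumPy. The classes below are hypothetical stand-ins that only mimic the shape of the numpy.int32 hierarchy. For ordinary classes like these, isinstance amounts to asking whether the class appears on the object's MRO; abstract base classes with registered virtual subclasses are the exception.

```python
class Generic:            # hypothetical stand-in for numpy.generic
    pass


class Number(Generic):    # stand-in for numpy.number
    pass


class Integer(Number):    # stand-in for numpy.integer
    pass


class Int32(Integer):     # stand-in for numpy.int32
    pass


x = Int32()

print([cls.__name__ for cls in Int32.__mro__])
# ['Int32', 'Integer', 'Number', 'Generic', 'object']

# Compare isinstance with a direct membership test on the MRO.
results = {
    cls.__name__: (isinstance(x, cls), cls in type(x).__mro__)
    for cls in (Int32, Number, object, int)
}
print(results["Number"])  # (True, True)
print(results["int"])     # (False, False)
```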

item() converts the numpy scalar to the corresponding native Python value:

isinstance(x[0].item(), int)
Out[41]: True

@kazemakase suggests using the numbers module:

import numbers
isinstance(x[0], numbers.Integral)
Out[47]: True

Edit:

isinstance accepts a tuple of classes, so we can handle both the int and numpy cases with one test:

In [259]: isinstance(x[0], (int, np.integer))
Out[259]: True
In [260]: isinstance(x[0].item(), (int, np.integer))
Out[260]: True
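The tuple test can be wrapped in a small helper. This is a sketch, not an established API: is_integer and INTEGER_TYPES are hypothetical names, the NumPy import is treated as optional so the snippet also runs without it, and excluding bool is an added design choice that the test above does not make.

```python
try:
    import numpy as np
except ImportError:  # NumPy is optional for this sketch
    np = None

# Accept Python ints, plus NumPy integer scalars when NumPy is available.
INTEGER_TYPES = (int,) if np is None else (int, np.integer)


def is_integer(value):
    """True for Python ints and NumPy integer scalars, but not for bool."""
    # bool is a subclass of int, so it is excluded explicitly here.
    return isinstance(value, INTEGER_TYPES) and not isinstance(value, bool)


print(is_integer(3))     # True
print(is_integer(3.0))   # False
print(is_integer(True))  # False
if np is not None:
    print(is_integer(np.int32(3)))      # True
    print(is_integer(np.float64(3.0)))  # False
```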
hpaulj answered Sep 26 '22 01:09
