I was playing with comparing data types of two different arrays to pick one that is suitable for combining the two. I was happy to discover that I could perform comparison operations, but in the process discovered the following strange behavior:
In [1]: numpy.int16 > numpy.float32
Out[1]: True
In [2]: numpy.dtype('int16') > numpy.dtype('float32')
Out[2]: False
Can anyone explain what is going on here? This is NumPy 1.8.2.
The first comparison is not meaningful; the second is.
With numpy.int16 > numpy.float32 we are comparing two type objects:
>>> type(numpy.int16)
type
>>> numpy.int16 > numpy.float32 # I'm using Python 3
TypeError: unorderable types: type() > type()
In Python 3 this comparison fails immediately, since there is no defined ordering for type instances. In Python 2 a boolean is returned, but it cannot be relied upon: the comparison falls back to implementation details such as the objects' memory addresses.
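The same behaviour can be seen without NumPy. Here is a minimal sketch that uses the built-in classes int and float as stand-ins for numpy.int16 and numpy.float32; the helper name try_greater is made up for illustration. The exact wording of the TypeError message differs between Python 3 versions, so the sketch only checks that the exception is raised.

```python
def try_greater(a, b):
    """Return a > b, or None if the two objects define no ordering."""
    try:
        return a > b
    except TypeError:
        return None

# Classes are instances of `type`, which defines no ordering in Python 3.
print(try_greater(int, float))   # None on Python 3
# Ordinary numbers do define an ordering, so they compare as usual.
print(try_greater(2, 1.5))       # True
```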
The second comparison does work in Python 3, and it works consistently (same in Python 2). This is because we're now comparing dtype instances:
>>> type(numpy.dtype('int16'))
numpy.dtype
>>> numpy.dtype('int16') > numpy.dtype('float32')
False
>>> numpy.dtype('int32') < numpy.dtype('|S10')
False
>>> numpy.dtype('int32') < numpy.dtype('|S11')
True
What's the logic behind this ordering?
dtype instances are ordered according to whether one can be cast (safely) to another. One type is less than another if it can be safely cast to that type.
For the implementation of the comparison operators, look at descriptor.c; specifically at the arraydescr_richcompare function.
Here's what the < operator maps to:
switch (cmp_op) {
case Py_LT:
    if (!PyArray_EquivTypes(self, new) && PyArray_CanCastTo(self, new)) {
        result = Py_True;
    }
    else {
        result = Py_False;
    }
    break;
Essentially, NumPy just checks (i) that the two types are not equivalent, and (ii) that the first type can be safely cast to the second type.
This functionality is also exposed in the NumPy API as np.can_cast:
>>> np.can_cast('int32', '|S10')
False
>>> np.can_cast('int32', '|S11')
True
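Putting the two conditions together, the < check can be approximated in Python as below. This is only a sketch: the helper name dtype_less_than is made up, and it assumes that != on dtype instances corresponds to !PyArray_EquivTypes and that np.can_cast with casting="safe" corresponds to PyArray_CanCastTo. The last line uses np.promote_types, which may be the more direct tool for the original goal of picking a dtype that can hold both arrays.

```python
import numpy as np

def dtype_less_than(a, b):
    """Sketch of the Py_LT branch: not equivalent, and a casts safely to b."""
    a, b = np.dtype(a), np.dtype(b)
    # Assumption: `!=` on dtypes plays the role of !PyArray_EquivTypes.
    return bool(a != b and np.can_cast(a, b, casting="safe"))

print(dtype_less_than("int16", "float32"))   # True: int16 fits in float32
print(dtype_less_than("float32", "int16"))   # False: the cast would lose data
print(dtype_less_than("int16", "int16"))     # False: the types are equivalent

# Smallest dtype to which both inputs can be safely cast.
print(np.promote_types("int16", "float32"))  # float32
```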
It's nothing interesting. Python 2 tries to provide consistent but meaningless comparison results for objects that don't define how to compare themselves with each other. The developers decided that was a mistake, and in Python 3, these comparisons will raise a TypeError.