NumPy data type comparison

Question

I was playing with comparing data types of two different arrays to pick one that is suitable for combining the two. I was happy to discover that I could perform comparison operations, but in the process discovered the following strange behavior:

In [1]: numpy.int16 > numpy.float32
Out[1]: True

In [2]: numpy.dtype('int16') > numpy.dtype('float32')
Out[2]: False

Can anyone explain what is going on here? This is NumPy 1.8.2.

Alex Riley · Accepted Answer

The first comparison is not meaningful; the second is.

With numpy.int16 > numpy.float32 we are comparing two type objects:

>>> type(numpy.int16)
type
>>> numpy.int16 > numpy.float32 # I'm using Python 3
TypeError: unorderable types: type() > type()

In Python 3 this comparison fails immediately, since no ordering is defined for type instances. In Python 2 a boolean is returned, but it cannot be relied upon for consistency (it falls back to comparing memory addresses or some other implementation detail).
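A minimal sketch of the Python 3 behaviour described above. It uses only built-in types, so NumPy is not needed to reproduce it; the helper name `try_order` is hypothetical and exists only for illustration.

```python
def try_order(a, b):
    """Return the result of a > b, or the name of the exception raised."""
    try:
        return a > b
    except TypeError as exc:
        return type(exc).__name__


# Classes are instances of `type`, which defines no ordering in Python 3.
print(try_order(int, float))   # TypeError
# Ordinary values of the same type still compare as usual.
print(try_order(2, 1))         # True
```

The same thing happens with numpy.int16 and numpy.float32, because both are plain type objects.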

The second comparison works in Python 3 and gives the same, consistent result in Python 2. This is because we're now comparing dtype instances:

>>> type(numpy.dtype('int16'))
numpy.dtype
>>> numpy.dtype('int16') > numpy.dtype('float32')
False
>>> numpy.dtype('int32') < numpy.dtype('|S10')
False
>>> numpy.dtype('int32') < numpy.dtype('|S11')
True

What's the logic behind this ordering?

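dtype instances are ordered according to whether one can be safely cast to another: one dtype is less than another if the two are not equivalent and the first can be safely cast to the second. This explains the string results above: the longest int32 value, -2147483648, needs 11 characters, so int32 fits in '|S11' but not in '|S10'.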

For the implementation of the comparison operators, look at descriptor.c; specifically at the arraydescr_richcompare function.

Here's what the < operator maps to:

switch (cmp_op) {
    case Py_LT:
        if (!PyArray_EquivTypes(self, new) && PyArray_CanCastTo(self, new)) {
            result = Py_True;
        }
        else {
            result = Py_False;
        }
        break;

Essentially, NumPy just checks that (i) the two types are not equivalent and (ii) the first type can be cast to the second.

This functionality is also exposed in the NumPy API as np.can_cast:

>>> np.can_cast('int32', '|S10')
False
>>> np.can_cast('int32', '|S11')
True
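The two conditions in the C excerpt can be mirrored in Python as a rough sketch. The helper `dtype_less_than` is a hypothetical name, not part of NumPy. It assumes that `!=` on dtype instances corresponds to the `PyArray_EquivTypes` check and that `np.can_cast` with its default safe casting corresponds to `PyArray_CanCastTo`.

```python
import numpy as np


def dtype_less_than(a, b):
    """Hypothetical helper mirroring the Py_LT branch shown above."""
    a, b = np.dtype(a), np.dtype(b)
    # (i) the dtypes are not equivalent, and (ii) a can be safely cast to b
    return bool(a != b and np.can_cast(a, b))


int16, float32 = np.dtype("int16"), np.dtype("float32")

print(dtype_less_than(int16, float32))   # True
print(dtype_less_than(float32, int16))   # False
print(dtype_less_than(int16, int16))     # False: equivalent dtypes

# np.promote_types returns the smallest dtype both inputs can be safely cast to
print(np.promote_types(int16, float32))  # float32
```

For the original goal of choosing a dtype that can hold both arrays, `np.promote_types` (or `np.result_type`) is probably a more direct tool than the comparison operators.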

user2357112 supports Monica · Answer

It's nothing interesting. Python 2 tries to provide consistent but meaningless comparison results for objects that don't define how to compare themselves with each other. The developers decided that was a mistake, and in Python 3, these comparisons will raise a TypeError.

Tags:

python

types

numpy

farenorth
