Mostly curious.
I've noticed (at least in Python 2.6 and 2.7) that a float has all the familiar rich comparison functions: __lt__(), __gt__(), __eq__(), etc.
>>> (5.0).__gt__(4.5)
True
but an int does not
>>> (5).__gt__(4)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'int' object has no attribute '__gt__'
Which is odd to me, because the operator itself works fine
>>> 5 > 4
True
Even strings support the comparison functions
>>> "hat".__gt__("ace")
True
but all the int has is __cmp__()
Seems strange to me, and so I was wondering why this came to be.
Just tested, and it works as expected in Python 3, so I am assuming there are some legacy reasons. I would still like to hear a proper explanation, though ;)
If we look at PEP 207 (Rich Comparisons), there is this interesting sentence right at the end:
The inlining already present which deals with integer comparisons would still apply, resulting in no performance cost for the most common cases.
So it seems that in 2.x there is an optimisation for integer comparison. If we take a look at the source code we can find this:
case COMPARE_OP:
    w = POP();
    v = TOP();
    if (PyInt_CheckExact(w) && PyInt_CheckExact(v)) {
        /* INLINE: cmp(int, int) */
        register long a, b;
        register int res;
        a = PyInt_AS_LONG(v);
        b = PyInt_AS_LONG(w);
        switch (oparg) {
        case PyCmp_LT: res = a <  b; break;
        case PyCmp_LE: res = a <= b; break;
        case PyCmp_EQ: res = a == b; break;
        case PyCmp_NE: res = a != b; break;
        case PyCmp_GT: res = a >  b; break;
        case PyCmp_GE: res = a >= b; break;
        case PyCmp_IS: res = v == w; break;
        case PyCmp_IS_NOT: res = v != w; break;
        default: goto slow_compare;
        }
        x = res ? Py_True : Py_False;
        Py_INCREF(x);
    }
    else {
      slow_compare:
        x = cmp_outcome(oparg, v, w);
    }
So it seems that 2.x already had a performance optimisation, letting the C code compare integers directly, which would not have been preserved if the rich comparison operators had been implemented for int.
Now in Python 3 __cmp__ is no longer supported, so the rich comparison operators must be there. This does not cause a performance hit as far as I can tell. For example, compare:
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit("2 < 1")
0.06980299949645996
to:
Python 3.2.3 (v3.2.3:3d0686d90f55, Apr 10 2012, 11:25:50)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> timeit.timeit("2 < 1")
0.06682920455932617
So it seems that similar optimisations are there, but my guess is that the judgement call was that putting them all in the 2.x branch would have been too great a change when backwards compatibility was a consideration.
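The Python 3 side of this can be checked without timing anything. The snippet below is only a quick sketch, assuming it is run on a Python 3 interpreter: it asks whether int exposes each rich comparison method and whether __cmp__ is still present. The names rich_methods, has_rich and has_cmp are made up for the example.

```python
# Assumption: run on Python 3. On 2.x the question above shows that
# int lacks __gt__, so the results there would differ.
rich_methods = ["__lt__", "__le__", "__eq__", "__ne__", "__gt__", "__ge__"]

# Which rich comparison methods does int expose?
has_rich = {name: hasattr(int, name) for name in rich_methods}

# Is the old three-way comparison method still around?
has_cmp = hasattr(int, "__cmp__")

print(has_rich)
print("int has __cmp__:", has_cmp)
print((5).__gt__(4))
```

On Python 3 every entry in has_rich should be True and has_cmp should be False, which matches what the question reports.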
In 2.x, if you want something like the rich comparison methods, you can get at them via the operator module:
>>> import operator
>>> operator.gt(2, 1)
True
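To illustrate the operator module route a bit further, here is a minimal sketch that maps each comparison symbol to its function in operator, so a comparison can be chosen at runtime without calling any dunder method on int. COMPARISONS and compare are hypothetical names invented for this example; the operator functions themselves (lt, le, eq, ne, gt, ge) are part of the standard library.

```python
import operator

# Map each comparison symbol to the matching function in the operator module.
COMPARISONS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
}


def compare(a, symbol, b):
    """Apply the comparison named by `symbol` to a and b."""
    return COMPARISONS[symbol](a, b)


# Evaluate 5 <op> 4 for every operator, in a fixed order.
results = [compare(5, sym, 4) for sym in ("<", "<=", "==", "!=", ">", ">=")]
print(results)  # [False, False, False, True, True, True]
```

Because these functions go through the same machinery as the operators themselves, they work on ints whether or not int has a __gt__ attribute.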
__cmp__() is the old-fashioned way of doing comparisons, and is deprecated in favor of the rich operators (__lt__, __le__, etc.), which were only introduced in Python 2.1. Likely the transition was not complete as of 2.7.x -- whereas in Python 3.x __cmp__ is completely removed.
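One practical consequence of that removal: code written around a cmp-style, three-way comparison needs an adapter on Python 3. The sketch below assumes Python 3 and uses functools.cmp_to_key from the standard library; legacy_cmp and by_length_then_alpha are hypothetical helpers written only for this illustration.

```python
from functools import cmp_to_key


def legacy_cmp(a, b):
    """Old-style three-way comparison: returns -1, 0 or 1."""
    return (a > b) - (a < b)


def by_length_then_alpha(a, b):
    """Compare strings by length first, then alphabetically."""
    return legacy_cmp(len(a), len(b)) or legacy_cmp(a, b)


words = ["pear", "fig", "banana"]

# cmp_to_key wraps the three-way function in objects that implement
# the rich comparison methods, which is what sorted() relies on.
ordered = sorted(words, key=cmp_to_key(by_length_then_alpha))
print(ordered)  # ['fig', 'pear', 'banana']
```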
Haskell has the most elegant implementation I've seen -- to make a type an instance of Ord (ordered), you just need to define <= (or compare), with == coming from the Eq class that Ord builds on, and the typeclass itself supplies default implementations for <, > and >= in terms of those (which you're more than welcome to define yourself if you want). You can write such a class yourself in Python; not sure why that's not the default, probably performance reasons.
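For what it's worth, the standard library already covers the "define two, get the rest" idea through functools.total_ordering. The following is a minimal sketch assuming Python 3; Version is a hypothetical class made up for the example, and only __eq__ and __lt__ are written by hand.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical example type ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


a, b = Version(2, 6), Version(2, 7)

# <=, > and >= are filled in by total_ordering; != comes from __eq__.
checks = (a < b, a <= b, a > b, a >= b, a == b, a != b)
print(checks)  # (True, True, False, False, False, True)
```

The derived methods are built on top of the two handwritten ones, so they do more work per call than handwritten versions would. That fits the performance guess above, though it is only a guess as to why this is not the default.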