Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding iterable types in comparisons

Recently I ran into cosmologicon's pywats and now try to understand part about fun with iterators:

>>> a = 2, 1, 3
>>> sorted(a) == sorted(a)
True
>>> reversed(a) == reversed(a)
False

Ok, sorted(a) returns a list and sorted(a) == sorted(a) becomes just a two lists comparision. But reversed(a) returns reversed object. So why these reversed objects are different? And id's comparision makes me even more confused:

>>> id(reversed(a)) == id(reversed(a))
True
like image 394
valignatev Avatar asked Oct 12 '15 12:10

valignatev


People also ask

What are the iterable types in Python?

Examples of iterables include all sequence types (such as list , str , and tuple ) and some non-sequence types like dict , file objects, and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements Sequence semantics.

What is an Iterable type?

Iterable is a pseudo-type introduced in PHP 7.1. It accepts any array or object implementing the Traversable interface. Both of these types are iterable using foreach and can be used with yield from within a generator.

What does it mean to be iterable give an example of an iterable data type?

Definition: An iterable is any Python object capable of returning its members one at a time, permitting it to be iterated over in a for-loop. Familiar examples of iterables include lists, tuples, and strings - any such sequence can be iterated over in a for-loop.

What is difference between iterable and iterator?

An Iterable is basically an object that any user can iterate over. An Iterator is also an object that helps a user in iterating over another object (that is iterable).


1 Answers

The basic reason why id(reversed(a) == id(reversed(a) returns True , whereas reversed(a) == reversed(a) returns False , can be seen from the below example using custom classes -

>>> class CA:
...     def __del__(self):
...             print('deleted', self)
...     def __init__(self):
...             print('inited', self)
...
>>> CA() == CA()
inited <__main__.CA object at 0x021B8050>
inited <__main__.CA object at 0x021B8110>
deleted <__main__.CA object at 0x021B8050>
deleted <__main__.CA object at 0x021B8110>
False
>>> id(CA()) == id(CA())
inited <__main__.CA object at 0x021B80F0>
deleted <__main__.CA object at 0x021B80F0>
inited <__main__.CA object at 0x021B80F0>
deleted <__main__.CA object at 0x021B80F0>
True

As you can see when you did customobject == customobject , the object that was created on the fly was not destroyed until after the comparison occurred, this is because that object was required for the comparison.

But in case of id(co) == id(co) , the custom object created was passed to id() function, and then only the result of id function is required for comparison , so the object that was created has no reference left, and hence the object was garbage collected, and then when the Python interpreter recreated a new object for the right side of == operation, it reused the space that was freed previously. Hence, the id for both came as same.

This above behavior is an implementation detail of CPython (it may/may not differ in other implementations of Python) . And you should never rely on the equality of ids . For example in the below case it gives the wrong result -

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> id(reversed(a)) == id(reversed(b))
True

The reason for this is again as explained above (garbage collection of the reversed object created for reversed(a) before creation of reversed object for reversed(b)).


If the lists are large, I think the most memory efficient and most probably the fastest method to compare equality for two iterators would be to use all() built-in function along with zip() function for Python 3.x (or itertools.izip() for Python 2.x).

Example for Python 3.x -

all(x==y for x,y in zip(aiterator,biterator))

Example for Python 2.x -

from itertools import izip
all(x==y for x,y in izip(aiterator,biterator))

This is because all() short circuits at the first False value is encounters, and `zip() in Python 3.x returns an iterator which yields out the corresponding elements from both the different iterators. This does not need to create a separate list in memory.

Demo -

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> all(x==y for x,y in zip(reversed(a),reversed(b)))
False
>>> all(x==y for x,y in zip(reversed(a),reversed(a)))
True
like image 157
Anand S Kumar Avatar answered Oct 21 '22 09:10

Anand S Kumar