Understanding iterable types in comparisons

Tags:

iterator

Recently I ran into cosmologicon's pywats and am now trying to understand the part about fun with iterators:

>>> a = 2, 1, 3
>>> sorted(a) == sorted(a)
True
>>> reversed(a) == reversed(a)
False

OK, sorted(a) returns a list, so sorted(a) == sorted(a) is just a comparison of two lists. But reversed(a) returns a reversed object. So why are these reversed objects different? And comparing their ids confuses me even more:

>>> id(reversed(a)) == id(reversed(a))
True

394

asked Oct 12 '15 12:10

valignatev

1 Answer

The basic reason why id(reversed(a)) == id(reversed(a)) returns True, whereas reversed(a) == reversed(a) returns False, can be seen from the example below, which uses a custom class -

>>> class CA:
...     def __del__(self):
...             print('deleted', self)
...     def __init__(self):
...             print('inited', self)
...
>>> CA() == CA()
inited <__main__.CA object at 0x021B8050>
inited <__main__.CA object at 0x021B8110>
deleted <__main__.CA object at 0x021B8050>
deleted <__main__.CA object at 0x021B8110>
False
>>> id(CA()) == id(CA())
inited <__main__.CA object at 0x021B80F0>
deleted <__main__.CA object at 0x021B80F0>
inited <__main__.CA object at 0x021B80F0>
deleted <__main__.CA object at 0x021B80F0>
True

As you can see, when you evaluate customobject == customobject, the objects created on the fly are not destroyed until after the comparison has happened, because both objects are needed for the comparison.
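One step worth spelling out for the == case: as far as I know, reversed objects do not define their own equality, so == falls back to the default identity comparison, and two distinct objects compare unequal. A minimal sketch (the names first and second are only illustrative):

```python
a = (2, 1, 3)

# Keep both iterators alive under separate names.
first = reversed(a)
second = reversed(a)

# By default an object compares equal to itself...
same_object = first == first
# ...but two distinct objects do not, even if they would yield the same items.
two_objects = first == second

# Comparing the contents means consuming the iterators, e.g. into lists.
contents_equal = list(first) == list(second)

print(same_object, two_objects, contents_equal)
```

Note that building the lists exhausts both iterators, so this is only a way to see the difference, not something to do with large inputs.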

But in the case of id(co) == id(co), the custom object is passed to the id() function, and only the result of id() is needed for the comparison. The object therefore has no reference left and is deallocated right away (CPython frees an object as soon as its reference count drops to zero). When the interpreter then creates a new object for the right-hand side of ==, it reuses the memory that was just freed. Hence the id comes out the same for both.
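A quick way to check this explanation is to keep a reference to each object, so that neither can be freed before the ids are compared. This is only a sketch; first and second are illustrative names:

```python
a = (2, 1, 3)

# Both reversed objects stay alive because they are bound to names.
first = reversed(a)
second = reversed(a)

# Two objects that exist at the same time never share an id.
ids_equal = id(first) == id(second)

print(ids_equal)
```

With both objects alive the ids differ, which is what you would expect from the identity check in the first place.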

This behavior is an implementation detail of CPython (it may or may not differ in other implementations of Python), and you should never rely on the equality of ids of temporary objects. For example, in the case below it gives a misleading result -

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> id(reversed(a)) == id(reversed(b))
True

The reason is again the one explained above: the reversed object created for reversed(a) is deallocated before the reversed object for reversed(b) is created.

If the lists are large, I think the most memory-efficient and most probably the fastest way to compare two iterators for equality is to use the built-in all() function together with zip() in Python 3.x (or itertools.izip() in Python 2.x).

Example for Python 3.x -

all(x == y for x, y in zip(aiterator, biterator))

Example for Python 2.x -

from itertools import izip
all(x == y for x, y in izip(aiterator, biterator))

This is because all() short-circuits at the first False value it encounters, and zip() in Python 3.x returns an iterator that yields the corresponding elements from the two iterators. This does not need to create a separate list in memory.

Demo -

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> all(x==y for x,y in zip(reversed(a),reversed(b)))
False
>>> all(x==y for x,y in zip(reversed(a),reversed(a)))
True
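One caveat with the zip() approach: zip() stops at the shorter input, so two iterators of different lengths can still compare as equal on their common part. If length matters, a possible variant is itertools.zip_longest() with a unique fill value. This is a sketch for Python 3.x, and iterators_equal and _MISSING are hypothetical names, not part of any library:

```python
from itertools import zip_longest

# Unique sentinel used to pad the shorter iterator (hypothetical name).
_MISSING = object()


def iterators_equal(first, second):
    """Return True if both iterables yield equal items and have the same length."""
    return all(
        x == y
        for x, y in zip_longest(first, second, fillvalue=_MISSING)
    )


a = [1, 2, 3]
b = [2, 3]

# Plain zip() only sees the pairs (3, 3) and (2, 2), so it reports a match.
prefix_only = all(x == y for x, y in zip(reversed(a), reversed(b)))

# zip_longest() pads the shorter side, so the length mismatch is detected.
strict = iterators_equal(reversed(a), reversed(b))

print(prefix_only, strict)
```

It still short-circuits on the first mismatch and does not build intermediate lists, so the memory argument above should carry over.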

157

answered Oct 21 '22 09:10

Anand S Kumar
