
Why are tuples constructed from differently initialized sets equal?

Tags: python, comparison, hashtable, tuples, set

I expected the following two tuples

    >>> x = tuple(set([1, "a", "b", "c", "z", "f"]))
    >>> y = tuple(set(["a", "b", "c", "z", "f", 1]))

to compare unequal, but they don't:

    >>> x == y
    True

Why is that?

890

asked Sep 30 '14 08:09

Ashish Anand

2 Answers

At first glance, it appears that x should always equal y, because two sets constructed from the same elements are always equal:

    >>> x = set([1, "a", "b", "c", "z", "f"])
    >>> y = set(["a", "b", "c", "z", "f", 1])
    >>> x
    {1, 'z', 'a', 'b', 'c', 'f'}
    >>> y
    {1, 'z', 'a', 'b', 'c', 'f'}
    >>> x == y
    True

However, it is not always the case that tuples (or other ordered collections) constructed from two equal sets are equal.
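The reason is that the two types define equality differently: sets compare by membership, while tuples compare position by position. A minimal illustration, using hand-written tuples (the variable names are just for this example) so that the result does not depend on any set's iteration order:

```python
# Sets are equal when they contain the same elements, in any order.
members_a = {1, "a", "b"}
members_b = {"b", 1, "a"}
sets_equal = members_a == members_b

# Tuples are equal only when every position matches.
forward = (1, "a", "b")
backward = ("b", "a", 1)
tuples_equal = forward == backward

print(sets_equal, tuples_equal)
```

So two tuples built from equal sets are equal only if both sets happen to be iterated in the same order.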

In fact, the result of your comparison is sometimes True and sometimes False, at least in Python >= 3.3. Testing the following code:

    # compare.py
    x = tuple(set([1, "a", "b", "c", "z", "f"]))
    y = tuple(set(["a", "b", "c", "z", "f", 1]))
    print(x == y)

... a thousand times:

    $ for x in {1..1000}
    > do
    >   python3.3 compare.py
    > done | sort | uniq -c
        147 False
        853 True

This is because, since Python 3.3, the hash values of strings, bytes and datetimes are randomized as a result of a security fix. Depending on what the hashes are, "collisions" may occur, in which case the order in which items are stored in the underlying array (and therefore the iteration order) depends on the insertion order.

Here's the relevant bit from the docs:

Security improvements:

Hash randomization is switched on by default.

— https://docs.python.org/3/whatsnew/3.3.html
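If you'd rather not rely on a shell loop, the same experiment can be sketched from Python by pinning the hash seed with the PYTHONHASHSEED environment variable and starting a fresh interpreter for each seed. The helper name compare_under_seed and the choice of 20 seeds are mine, and this assumes Python 3.7 or later for capture_output. The actual True/False split will vary with the interpreter version, so I'm not quoting one here.

```python
import os
import subprocess
import sys
from collections import Counter

# The comparison from the question, run in a separate interpreter each time.
SNIPPET = (
    'x = tuple(set([1, "a", "b", "c", "z", "f"]))\n'
    'y = tuple(set(["a", "b", "c", "z", "f", 1]))\n'
    'print(x == y)'
)


def compare_under_seed(seed):
    """Run the comparison in a fresh interpreter with a fixed hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Seed 0 disables randomization, so start from 1.
counts = Counter(compare_under_seed(seed) for seed in range(1, 21))
print(counts)
```

For a fixed seed the outcome is reproducible, which is handy if you need to debug order-dependent behaviour.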

EDIT: Since it's mentioned in the comments that the True/False ratio above is superficially surprising ...

Sets, like dictionaries, are implemented as hash tables - so if there's a collision, the order of items in the table (and so the order of iteration) will depend both on which item was added first (different in x and y in this case) and the seed used for hashing (different across Python invocations since 3.3). Since collisions are rare by design, and the examples in this question are smallish sets, the issue doesn't arise as often as one might initially suppose.
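The collision effect can be seen without any randomization by using small integers, which hash to themselves. This sketch assumes CPython, and assumes that a small set starts with an 8-slot table, so that 0 and 8 compete for the same slot; both of those are implementation details, not language guarantees.

```python
# Ints hash to their own value, so hash randomization plays no part here.
print(hash(0), hash(8))

# Assumption: CPython's small sets start with 8 slots, so both values
# map to slot 0 and whichever is inserted first takes it.
first = set([0, 8])
second = set([8, 0])

print(first == second)             # the sets themselves are equal
print(list(first), list(second))   # iteration order may differ between them
```

On the CPython builds I'm aware of the two lists come out in different orders, but since that is an implementation detail it shouldn't be relied on either way.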

For a thorough explanation of Python's implementation of dictionaries and sets, see The Mighty Dictionary.

105

answered Sep 27 '22 21:09

Zero Piraeus

There are two things at play here.

1. Sets are unordered: set([1, "a", "b", "c", "z", "f"]) == set(["a", "b", "c", "z", "f", 1])
2. When you convert a set to a tuple via the tuple constructor, it essentially iterates over the set and adds each element returned by the iteration.

The constructor syntax for tuples is

tuple(iterable) -> tuple initialized from iterable's items

Calling tuple(set([1, "a", "b", "c", "z", "f"])) is the same as calling tuple([i for i in set([1, "a", "b", "c", "z", "f"])])

The values for

[i for i in set([1, "a", "b", "c", "z", "f"])]

and

[i for i in set(["a", "b", "c", "z", "f", 1])]

are the same, since both iterate over the same set.

EDIT thanks to @ZeroPiraeus (check his answer): this is not guaranteed. The iteration order will not always be the same, even for equal sets.

The tuple constructor doesn't know the order in which the set is constructed.
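If what you actually want is an order-independent comparison, compare the sets themselves, or impose your own order before building the tuples. A minimal sketch follows; canonical_tuple is a hypothetical helper, and sorting by (type name, repr) is just one arbitrary way to order ints and strs together, since Python 3 will not compare them directly.

```python
def canonical_tuple(items):
    """Return the items as a tuple in a deterministic order.

    The sort key (type name, repr) is an arbitrary choice made for this
    example so that mixed ints and strs can be ordered.
    """
    return tuple(sorted(items, key=lambda item: (type(item).__name__, repr(item))))


x = set([1, "a", "b", "c", "z", "f"])
y = set(["a", "b", "c", "z", "f", 1])

print(x == y)                                    # compare the sets directly
print(canonical_tuple(x) == canonical_tuple(y))  # compare explicitly ordered tuples
print(canonical_tuple(x))
```

Either comparison gives the same answer regardless of insertion order or hash seed.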

answered Sep 27 '22 19:09

srj
