Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why the set is ordered when converting a list to set?

Tags:

python

list

set

I was trying to write a function to remove duplicates from a list in Python.

But after I did this, I found the list was sorted by converting it into set and back to a list.

Here is the script:

>>> l = [9,10,10,11,1,1,1,22,2,2,2]
>>> s = set(l)
>>> s
set([1, 2, 9, 10, 11, 22])
>>> l2 = list(s)
>>> l2
[1, 2, 9, 10, 11, 22]
>>> l2 = list(set(l))
>>> l2
[1, 2, 9, 10, 11, 22]
>>> 

The set s is ordered (at least ordered when printing it).

Why the set is ordered?

And what is the time complexity if I remove duplicates by running this:

def remove_duplicates(nums):
    return list(set(nums))
like image 844
xhanshawn Avatar asked Mar 27 '16 00:03

xhanshawn


1 Answers

The running time for the list(set(data)) approach is O(n).

The set appears ordered as an artifact of how integers are hashed. With other inputs, the data will scramble away from sorted order.

To overcome arbitrary ordering, use this idiom which is also O(n): list(collections.OrderedDict.fromkeys(data))

like image 141
Raymond Hettinger Avatar answered Sep 28 '22 07:09

Raymond Hettinger