I was trying to write a function to remove duplicates from a list in Python.
But after I did this, I found the list was sorted by converting it into set and back to a list.
Here is the script:
>>> l = [9,10,10,11,1,1,1,22,2,2,2]
>>> s = set(l)
>>> s
set([1, 2, 9, 10, 11, 22])
>>> l2 = list(s)
>>> l2
[1, 2, 9, 10, 11, 22]
>>> l2 = list(set(l))
>>> l2
[1, 2, 9, 10, 11, 22]
>>>
The set s
is ordered (at least ordered when printing it).
Why the set is ordered?
And what is the time complexity if I remove duplicates by running this:
def remove_duplicates(nums):
return list(set(nums))
The running time for the list(set(data))
approach is O(n).
The set appears ordered as an artifact of how integers are hashed. With other inputs, the data will scramble away from sorted order.
To overcome arbitrary ordering, use this idiom which is also O(n): list(collections.OrderedDict.fromkeys(data))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With