Retaining order while using Python's set difference

Tags: python, set

I'm doing a set difference operation in Python:

x = [1, 5, 3, 4]
y = [3]

result = list(set(x) - set(y))
print(result)

I'm getting:

[1, 4, 5]

As you can see, the order of the list elements has changed. How can I keep the remaining elements in the same order as in the original list x?

Avinash asked Apr 04 '12 05:04


2 Answers

It looks like you need an ordered set instead of a regular set.

>>> x = [1, 5, 3, 4]
>>> y = [3]
>>> print(list(OrderedSet(x) - OrderedSet(y)))
[1, 5, 4]

Python doesn't come with an ordered set, but it is easy to make one:

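from collections import OrderedDict
from collections.abc import Set  # collections.Set was removed in Python 3.10

class OrderedSet(Set):
    """Minimal immutable set that iterates in insertion order."""

    def __init__(self, iterable=()):
        # fromkeys() drops duplicates while keeping first-seen order
        self.d = OrderedDict.fromkeys(iterable)

    def __len__(self):
        return len(self.d)

    def __contains__(self, element):
        return element in self.d

    def __iter__(self):
        return iter(self.d)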
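
As a side note, here is a minimal sketch of a different technique that avoids the custom class. It assumes Python 3.7 or later, where plain dicts are guaranteed to preserve insertion order, so dict.fromkeys() can play the role of the ordered set. The name `exclude` and the extra duplicate 5 in x are only for illustration.

```python
x = [1, 5, 3, 4, 5]
y = [3]

# A regular set is enough for the items being removed
exclude = set(y)

# dict.fromkeys() keeps the first occurrence of each item, in insertion order
result = [item for item in dict.fromkeys(x) if item not in exclude]
print(result)  # [1, 5, 4]
```

Like the OrderedSet version, this drops repeated items from x.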

Hope this helps :-)

Raymond Hettinger answered Oct 17 '22 22:10


Sets are unordered, so you will need to put the results back in the correct order after doing your set difference. Fortunately you already have the elements in the order you want, so this is easy.

diff = set(x) - set(y)
result = [o for o in x if o in diff]

But this can be streamlined: you can do the difference as part of the list comprehension itself (though it is arguably slightly less obvious that a set difference is what you're computing).

sety = set(y)
result = [o for o in x if o not in sety]

You could even do it without creating the set from y, but the set will provide fast membership tests, which will save you significant time if either list is large.
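
To make the comparison concrete, here is a rough sketch of both variants wrapped in functions. The function names are made up for illustration, and the set-based version assumes the items in y are hashable.

```python
def ordered_difference(x, y):
    """Return the items of x that are not in y, in x's original order."""
    exclude = set(y)  # average O(1) membership tests; items must be hashable
    return [item for item in x if item not in exclude]


def ordered_difference_no_set(x, y):
    """Same result without a set; `item not in y` scans the list each time."""
    return [item for item in x if item not in y]


print(ordered_difference([1, 5, 3, 4], [3]))         # [1, 5, 4]
print(ordered_difference_no_set([1, 5, 3, 4], [3]))  # [1, 5, 4]
print(ordered_difference([1, 5, 1, 3], [3]))         # [1, 5, 1]
```

Note that, unlike a true set difference, the comprehension keeps repeated items from x, as the last call shows. Whether that is desirable depends on the data.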

kindall answered Oct 17 '22 21:10