I have a list of tuples whose elements are like this:
aa = [('a', 'b'), ('c', 'd'), ('b', 'a')]
I want to treat ('a', 'b') and ('b', 'a') as the same group and extract only the unique tuples. So the output should be like this:
[('a', 'b'), ('c', 'd')]
How can I achieve this efficiently as my list consists of millions of such tuples?
Convert each tuple to a frozenset, deduplicate by hashing them into a set, and convert back to tuples:
In [193]: map(tuple, set(map(frozenset, aa))) # Python 2; in Python 3, map is lazy, so wrap the call in list(...)
Out[193]: [('d', 'c'), ('a', 'b')]
Here's a slightly more readable version with a list comprehension:
In [194]: [tuple(x) for x in set(map(frozenset, aa))]
Out[194]: [('d', 'c'), ('a', 'b')]
Do note that, for your particular use case, a list of tuples isn't the best choice of data structure. Consider storing your data as a set of frozensets to begin with:
In [477]: set(map(frozenset, aa))
Out[477]: {frozenset({'a', 'b'}), frozenset({'c', 'd'})}
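Note that the outputs above come back in arbitrary order, both across the list and within each tuple (('d', 'c') rather than ('c', 'd')), because sets are unordered. If the result has to match the original order exactly, one option is to keep the first tuple seen for each group and use the frozenset only as a lookup key. This is a minimal sketch, not part of the original answer; the function name unique_unordered is made up for illustration:

```python
def unique_unordered(pairs):
    """Keep the first occurrence of each tuple, ignoring element order."""
    seen = set()      # frozenset keys of the groups already emitted
    result = []
    for pair in pairs:
        key = frozenset(pair)   # ('a', 'b') and ('b', 'a') share one key
        if key not in seen:
            seen.add(key)
            result.append(pair)  # keep the tuple exactly as first seen
    return result


aa = [('a', 'b'), ('c', 'd'), ('b', 'a')]
print(unique_unordered(aa))  # [('a', 'b'), ('c', 'd')]
```

This is still a single pass with one hash lookup per tuple, so it should scale to millions of entries. One caveat with frozenset as the key: repeated elements are collapsed, so ('a', 'a', 'b') would land in the same group as ('a', 'b'). If that matters and the elements are sortable, tuple(sorted(pair)) could be used as the key instead.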