I have a datastructure like this:
[
[('A', '1'), ('B', '2')],
[('A', '1'), ('B', '2')],
[('A', '4'), ('C', '5')]
]
And I want to obtain this:
[
[('A', '1'), ('B', '2')],
[('A', '4'), ('C', '5')]
]
Is there a good way of doing this while preserving order as shown?
Commands for copy-pasting:
sample = []
sample.append([('A', '1'), ('B', '2')])
sample.append([('A', '1'), ('B', '2')])
sample.append([('A', '4'), ('C', '5')])
This is a somewhat famous question which was well answered by a famous Pythonista long ago: http://code.activestate.com/recipes/52560-remove-duplicates-from-a-sequence/
If you can assume equal records are adjacent, there is a recipe in the itertools docs:
from operator import itemgetter
from itertools import groupby, imap
def unique_justseen(iterable, key=None):
"List unique elements, preserving order. Remember only the element just seen."
# unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
# unique_justseen('ABBCcAD', str.lower) --> A B C A D
return imap(next, imap(itemgetter(1), groupby(iterable, key)))
If you can only assume orderable elements, here a variant using the bisect module. Given n inputs with r unique values, its search step costs O(n log r). If a new unique value is found, it is inserted in the seen list for a cost of O(r * r).
from bisect import bisect_left, insort
def dedup(seq):
'Remove duplicates. Preserve order first seen. Assume orderable, but not hashable elements'
result = []
seen = []
for x in seq:
i = bisect_left(seen, x)
if i == len(seen) or seen[i] != x:
seen.insert(i, x)
result.append(x)
return result
Here is an order-preserving variant of the sort/unique idiom. This will give you O(n log n) performance, provided your items are at least sortable.
def unique(a):
indices = sorted(range(len(a)), key=a.__getitem__)
indices = set(next(it) for k, it in
itertools.groupby(indices, key=a.__getitem__))
return [x for i, x in enumerate(a) if i in indices]
Example (with hashable items for simplicity):
>>> a = ['F', 'J', 'B', 'F', 'V', 'A', 'E', 'U', 'B', 'U', 'Z', 'K']
>>> unique(a)
['F', 'J', 'B', 'V', 'A', 'E', 'U', 'Z', 'K']
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With