For example:
>>> x = [1, 1, 2, 'a', 'a', 3]
>>> unique(x)
[1, 2, 'a', 3]
Assume list elements are hashable.
Clarification: the result should keep the first occurrence of each element. For example, [1, 2, 3, 2, 3, 1] becomes [1, 2, 3].
If you want to remove duplicate elements from a list in Python while preserving order, you can use the OrderedDict class from the collections module. More specifically, OrderedDict.fromkeys(lst) builds a dictionary whose keys are the list's elements with duplicates removed, while still maintaining their original order.
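For instance, here is a minimal sketch using the example list from above (note that in Python 3.7+ a plain dict also preserves insertion order, so dict.fromkeys(x) works the same way):

from collections import OrderedDict

x = [1, 1, 2, 'a', 'a', 3]

# fromkeys() keeps only the first occurrence of each key, in order
print(list(OrderedDict.fromkeys(x)))  # [1, 2, 'a', 3]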
def unique(items):
    # Remember which elements have already appeared; set lookups are O(1)
    found = set()
    keep = []
    for item in items:
        if item not in found:
            found.add(item)
            keep.append(item)
    return keep

print(unique([1, 1, 2, 'a', 'a', 3]))
Using:
lst = [8, 8, 9, 9, 7, 15, 15, 2, 20, 13, 2, 24, 6, 11, 7, 12, 4, 10, 18, 13, 23, 11, 3, 11, 12, 10, 4, 5, 4, 22, 6, 3, 19, 14, 21, 11, 1, 5, 14, 8, 0, 1, 16, 5, 10, 13, 17, 1, 16, 17, 12, 6, 10, 0, 3, 9, 9, 3, 7, 7, 6, 6, 7, 5, 14, 18, 12, 19, 2, 8, 9, 0, 8, 4, 5]
And using the timeit module:
$ python -m timeit -s 'import uniquetest' 'uniquetest.etchasketch(uniquetest.lst)'
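For reference, here is roughly what that command does, expressed through the timeit Python API (a sketch; it assumes a uniquetest.py module defining lst and the functions under test, which isn't reproduced here):

import timeit

# The -s setup statement runs once; the main statement runs in a loop
t = timeit.Timer('uniquetest.etchasketch(uniquetest.lst)',
                 setup='import uniquetest')

# timeit() returns total seconds for 100000 runs; dividing by the run
# count gives the per-loop figure the command line prints
print(t.timeit(number=100000) / 100000)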
Running the same command for the various other functions (which I named after their posters), I get the following results on my first-generation Intel MacBook Pro:
Allen: 14.6 µs per loop [1]
Terhorst: 26.6 µs per loop
Tarle: 44.7 µs per loop
ctcherry: 44.8 µs per loop
Etchasketch 1 (short): 64.6 µs per loop
Schinckel: 65.0 µs per loop
Etchasketch 2: 71.6 µs per loop
Little: 89.4 µs per loop
Tyler: 179.0 µs per loop
[1] Note that Allen's function modifies the list in place. I believe this has skewed its timing: the timeit module runs the code 100000 times, so 99999 of those runs operate on an already dupe-less list.
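One way to avoid that skew is to hand the in-place version a fresh copy of the list on every run (a sketch, assuming the function is named uniquetest.allen after its poster; the copy itself adds a small, roughly constant overhead):

$ python -m timeit -s 'import uniquetest' 'uniquetest.allen(list(uniquetest.lst))'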
Summary: the straightforward implementation with sets wins over confusing one-liners :-)