Python list(set(list(...))) to remove duplicates

Tags: python, list, set

Is

list(set(some_list))

a good way to remove duplicates from a list? (Python 3.3 if that matters)

(Edited to address some of the comments... it was perhaps too terse before).

Specifically,

  • is it at least comparable in efficiency (mainly speed, but also memory) to writing one's own algorithm, if not better? It's clearly the most concise code
  • is it reliable? Are there any situations where it breaks? (one has been mentioned already: list items need to be hashable)
  • is there a more Pythonic way of doing it?
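
The hashability caveat from the second bullet can be illustrated with a minimal sketch. The sample lists here are made up for illustration and are not part of the original question.

```python
# Hypothetical sample data, chosen only to illustrate the caveat.
hashable_items = [3, 1, 3, 2, 1]
unhashable_items = [[1, 2], [1, 2], [3]]

# Works: ints are hashable. The order of the result is not guaranteed.
unique_items = list(set(hashable_items))

# Fails: lists are mutable and therefore unhashable, so set() raises TypeError.
try:
    list(set(unhashable_items))
    failed = False
except TypeError:
    failed = True
```

So list(set(...)) is reliable only when every element is hashable and the original order does not matter.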
RFlack asked Oct 01 '15 04:10



3 Answers

The method you show is probably shortest and easiest to understand; that would make it Pythonic by most definitions.

If you need to preserve the order of the list, you can use collections.OrderedDict instead of set:

list(collections.OrderedDict((k, None) for k in some_list).keys())

Edit: as of Python 3.7 (or CPython 3.6, where it was still an implementation detail rather than a language guarantee) it's not necessary to use OrderedDict; a regular dict also retains insertion order. So you can rewrite the above:

list({k: None for k in some_list}.keys())
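
As a small sketch of the dict-based approach, assuming Python 3.7 or later and a made-up sample list, the comprehension above and the built-in dict.fromkeys give the same order-preserving result:

```python
some_list = ["b", "a", "b", "c", "a"]  # hypothetical sample data

# dict keys are unique and, from Python 3.7 on, keep insertion order,
# so the first occurrence of each element decides its position.
deduped = list(dict.fromkeys(some_list))

# Equivalent to the comprehension form; iterating a dict yields its keys,
# so calling .keys() is optional.
same = list({k: None for k in some_list})
```

Both forms still require the elements to be hashable.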

If the elements aren't hashable but can be sorted, you can use itertools.groupby to remove duplicates:

list(k for k,g in itertools.groupby(sorted(some_list)))

Edit: the above can be written as a list comprehension which some might consider more Pythonic.

[k for k,_ in itertools.groupby(sorted(some_list))]
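
A minimal sketch of the groupby approach on unhashable but sortable elements, using a made-up list of lists as input:

```python
import itertools

# Hypothetical sample data: lists are unhashable, so set() would fail here,
# but lists of ints can be compared and therefore sorted.
some_list = [[2, 3], [1], [2, 3], [1], [0, 5]]

# sorted() places equal elements next to each other; groupby() then yields
# one key per run of equal neighbours.
deduped = [k for k, _ in itertools.groupby(sorted(some_list))]
```

Note that the result comes back in sorted order rather than the original order, and the sort makes this O(n log n) rather than the roughly linear cost of the set-based version.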
Mark Ransom answered Oct 07 '22 11:10



(As suggested in the comments, adding this comment as an answer as well.)

Your own solution looks good and pretty Pythonic to me. If you're using NumPy, you can also do new_list = numpy.unique(some_list); note that this returns the sorted unique elements as a NumPy array rather than a list. This more or less 'reads like a sentence', which I believe is always a good benchmark for something being "Pythonic".

EelkeSpaak answered Oct 07 '22 12:10



The shortest way to preserve order (available since Python 2.7):

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

If there is no need to preserve order, list(set(...)) is just fine.

Alexander Trakhimenok answered Oct 07 '22 13:10
