
How do I find the duplicates in a list and create another list with them?

How can I find the duplicates in a Python list and create another list of the duplicates? The list only contains integers.

MFB asked Mar 23 '12 07:03




2 Answers

To remove duplicates use set(a). To print duplicates, something like:

    import collections

    a = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]
    print([item for item, count in collections.Counter(a).items() if count > 1])
    # [1, 2, 5]

Note that collections.Counter is not particularly efficient (timings) and is probably overkill here; a plain set will perform better. The following code computes a list of the unique elements in source order:

    seen = set()
    uniq = []
    for x in a:
        if x not in seen:
            uniq.append(x)
            seen.add(x)

or, more concisely:

    seen = set()
    uniq = [x for x in a if x not in seen and not seen.add(x)]

I don't recommend the latter style, because it is not obvious what not seen.add(x) is doing (the set add() method always returns None, hence the need for not).
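To make the trick in the one-liner concrete, here is a small sketch showing that set.add returns None while still mutating the set, which is why "not seen.add(x)" is always true and is used only for its side effect. The variable name result is illustrative and not part of the original answer.

```python
seen = set()

# set.add mutates the set in place and returns None.
result = seen.add(1)
print(result)  # None
print(seen)    # {1}

# `not None` is True, so `not seen.add(x)` never filters anything out.
# It only records x in the set the first time x is encountered.
a = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]
seen = set()
uniq = [x for x in a if x not in seen and not seen.add(x)]
print(uniq)    # [1, 2, 3, 5, 6]
```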

To compute the list of duplicated elements without libraries:

    seen = set()
    dupes = []

    for x in a:
        if x in seen:
            dupes.append(x)
        else:
            seen.add(x)

or, more concisely:

    seen = set()
    dupes = [x for x in a if x in seen or seen.add(x)]
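Both versions append an element every time it reappears, so for the sample list a they produce [2, 1, 5, 5, 5]. If each duplicate should be reported only once, one possible sketch keeps a second set of values already reported. The function name find_duplicates and the reported set are assumptions made for illustration, not part of the original answer.

```python
def find_duplicates(items):
    """Return each duplicated item once, in the order it first repeats."""
    seen = set()      # every value encountered so far
    reported = set()  # values already added to dupes
    dupes = []
    for x in items:
        if x in seen:
            if x not in reported:
                dupes.append(x)
                reported.add(x)
        else:
            seen.add(x)
    return dupes


a = [1, 2, 3, 2, 1, 5, 6, 5, 5, 5]
print(find_duplicates(a))  # [2, 1, 5]
```

This stays linear in the length of the list, at the cost of one extra set.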

If list elements are not hashable, you cannot use sets/dicts and have to resort to a quadratic time solution (compare each with each). For example:

    a = [[1], [2], [3], [1], [5], [3]]

    no_dupes = [x for n, x in enumerate(a) if x not in a[:n]]
    print(no_dupes)  # [[1], [2], [3], [5]]

    dupes = [x for n, x in enumerate(a) if x in a[:n]]
    print(dupes)  # [[1], [3]]

answered Oct 03 '22 00:10 (12 revs, 8 users 71%)
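When the unhashable elements can be converted to a hashable form, the set-based approach may still apply. The sketch below assumes every element is a flat list of hashable values, so that tuple(x) can serve as a lookup key; that assumption does not hold for nested lists or lists containing dicts, where the quadratic version remains the fallback.

```python
a = [[1], [2], [3], [1], [5], [3]]

seen = set()
dupes = []
for x in a:
    key = tuple(x)  # assumption: x is a flat list of hashable values
    if key in seen:
        dupes.append(x)
    else:
        seen.add(key)

print(dupes)  # [[1], [3]]
```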


A very simple solution, but with O(n^2) time complexity, because l.count(x) scans the whole list once for every element:

    >>> l = [1, 2, 3, 4, 4, 5, 5, 6, 1]
    >>> set([x for x in l if l.count(x) > 1])
    {1, 4, 5}

Ritesh Kumar answered Oct 03 '22 00:10
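For comparison, here is a sketch that checks the count-based one-liner against a single-pass tally with collections.Counter. The names quadratic, counts, and linear are illustrative only; the point is that both give the same set, while the second counts each element once instead of rescanning the list.

```python
from collections import Counter

l = [1, 2, 3, 4, 4, 5, 5, 6, 1]

# Quadratic: l.count(x) rescans the whole list for every element.
quadratic = set(x for x in l if l.count(x) > 1)

# Linear: tally every element once, then keep those seen more than once.
counts = Counter(l)
linear = set(x for x, n in counts.items() if n > 1)

print(quadratic == linear)  # True
```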