Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Checking if a list has duplicate lists

Given a list of lists, I want to make sure that there are no two lists that have the same values and order. For instance with my_list = [[1, 2, 4, 6, 10], [12, 33, 81, 95, 110], [1, 2, 4, 6, 10]] it is supposed to return me the existence of duplicate lists, i.e. [1, 2, 4, 6, 10].

I used while but it doesn't work as I want. Does someone know how to fix the code:

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
r = len(routes) - 1
i = 0
while r != 0:
    if cmp(routes[i], routes[i + 1]) == 0:
        print "Yes, they are duplicate lists!"
    r -= 1
    i += 1
like image 334
Don Avatar asked Jan 23 '17 15:01

Don


People also ask

How do I check if a list has duplicates?

For this, we will use the count() method. The count() method, when invoked on a list, takes the element as input argument and returns the number of times the element is present in the list. For checking if the list contains duplicate elements, we will count the frequency of each element.

How do you check if there is a duplicate in a list Java?

One more way to detect duplication in the java array is adding every element of the array into HashSet which is a Set implementation. Since the add(Object obj) method of Set returns false if Set already contains an element to be added, it can be used to find out if the array contains duplicates in Java or not.


1 Answers

you could count the occurrences in a list comprehension, converting them to a tuple so you can hash & apply unicity:

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
dups = {tuple(x) for x in routes if routes.count(x)>1}

print(dups)

result:

{(1, 2, 4, 6, 10)}

Simple enough, but a lot of looping under the hood because of repeated calls to count. There's another way, which involves hashing but has a lower complexity would be to use collections.Counter:

from collections import Counter

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]

c = Counter(map(tuple,routes))
dups = [k for k,v in c.items() if v>1]

print(dups)

Result:

[(1, 2, 4, 6, 10)]

(Just count the tuple-converted sublists - fixing the hashing issue -, and generate dup list using list comprehension, keeping only items which appear more than once)

Now, if you just want to detect that there are some duplicate lists (without printing them) you could

  • convert the list of lists to a list of tuples so you can hash them in a set
  • compare the length of the list vs the length of the set:

len is different if there are some duplicates:

routes_tuple = [tuple(x) for x in routes]    
print(len(routes_tuple)!=len(set(routes_tuple)))

or, being able to use map in Python 3 is rare enough to be mentionned so:

print(len(set(map(tuple,routes))) != len(routes))
like image 157
Jean-François Fabre Avatar answered Oct 10 '22 04:10

Jean-François Fabre