
Checking if a list has duplicate lists

Tags: python, list, duplicates

Given a list of lists, I want to make sure that no two lists have the same values in the same order. For instance, with my_list = [[1, 2, 4, 6, 10], [12, 33, 81, 95, 110], [1, 2, 4, 6, 10]] it should report that duplicate lists exist, i.e. [1, 2, 4, 6, 10].

I used a while loop, but it doesn't work as I want. Does anyone know how to fix the code:

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
r = len(routes) - 1
i = 0
while r != 0:
    if cmp(routes[i], routes[i + 1]) == 0:
        print "Yes, they are duplicate lists!"
    r -= 1
    i += 1

asked Jan 23 '17 15:01

Don

1 Answer

You could count the occurrences in a set comprehension, converting each sublist to a tuple so that it can be hashed and the set keeps only one copy of each duplicate:

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
dups = {tuple(x) for x in routes if routes.count(x)>1}

print(dups)

Result:

{(1, 2, 4, 6, 10)}

Simple enough, but there is a lot of looping under the hood, because count rescans the whole list for every sublist, which makes the approach quadratic overall. Another way, which also relies on hashing but has a lower (roughly linear) complexity, is to use collections.Counter:

from collections import Counter

routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]

c = Counter(map(tuple,routes))
dups = [k for k,v in c.items() if v>1]

print(dups)

Result:

[(1, 2, 4, 6, 10)]

(This just counts the tuple-converted sublists, which fixes the hashing issue, and builds the list of duplicates with a list comprehension, keeping only the items that appear more than once.)
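To illustrate the hashing issue that the tuple conversion works around, here is a minimal sketch. It assumes the sublists contain only hashable items (such as integers); the names hashing_failed and unique_routes are only there for the illustration.

```python
routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]

# Lists are mutable, so Python refuses to hash them:
# building a set of lists raises TypeError.
try:
    set(routes)
    hashing_failed = False
except TypeError as exc:
    hashing_failed = True
    print("cannot hash lists:", exc)

# Tuples of hashable items are hashable, so the same data
# can go into a set (or a Counter) once converted.
unique_routes = set(map(tuple, routes))
print(len(unique_routes))  # 2
```

The same restriction applies to Counter keys, which is why the sublists are mapped to tuples before counting.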

Now, if you just want to detect whether there are any duplicate lists (without printing them), you could:

- convert the list of lists to a list of tuples so you can hash them in a set
- compare the length of the list with the length of the set

The lengths differ if there are duplicates:

routes_tuple = [tuple(x) for x in routes]
print(len(routes_tuple) != len(set(routes_tuple)))

Or, since a good use of map in Python 3 is rare enough to be worth mentioning:

print(len(set(map(tuple,routes))) != len(routes))
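If only a yes/no answer is needed, a variation on the same idea is to stop at the first repeated sublist instead of converting everything up front. The sketch below is one possible way to do that, not part of the approaches above; has_duplicate_lists is a hypothetical helper name, and it assumes the sublists contain only hashable items.

```python
def has_duplicate_lists(list_of_lists):
    """Return True as soon as some sublist is seen a second time."""
    seen = set()
    for sublist in list_of_lists:
        key = tuple(sublist)  # tuples are hashable, lists are not
        if key in seen:
            return True  # early exit: no need to scan the rest
        seen.add(key)
    return False


routes = [[1, 2, 4, 6, 10], [1, 3, 8, 9, 10], [1, 2, 4, 6, 10]]
print(has_duplicate_lists(routes))  # True
```

Because tuples compare element by element, [1, 2, 4] and [4, 2, 1] are treated as different lists here, which matches the "same values and order" requirement in the question.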

answered Oct 10 '22 04:10

Jean-François Fabre
