Does anyone know how I can get the index positions of duplicate items in a Python list? I have tried doing this, and it keeps giving me only the index of the first occurrence of the item in the list.
List = ['A', 'B', 'A', 'C', 'E']
I want it to give me:
index 0: A
index 2: A
Method #1: Using a loop + set(). Here, we add each element to a set as we walk the list and check whether the current element has already been seen. If it is the second or a later occurrence, its index is added to the result list.
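A minimal sketch of that loop + set() idea, assuming the elements are hashable. The function name duplicate_indices is made up for illustration and is not from any of the answers here. Note that it reports only the repeat occurrences, not the first one, so it does not give index 0 for the list in the question.

```python
def duplicate_indices(items):
    """Return the indices of every second-or-later occurrence in items."""
    seen = set()      # elements encountered so far
    result = []       # indices of repeat occurrences
    for i, item in enumerate(items):
        if item in seen:
            # Already seen at an earlier position, so this is a duplicate.
            result.append(i)
        else:
            seen.add(item)
    return result


print(duplicate_indices(['A', 'B', 'A', 'C', 'E']))  # [2]
```

If you also need the index of the first occurrence, a dictionary that maps each item to its list of positions is a better fit than a set.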
To find the index of an element in a list, you use the index() method, which returns the position of the first match. However, if you attempt to find an element that doesn't exist in the list using index(), you'll get a ValueError. To avoid this, check that the element is present with the in operator before calling index().
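As a rough illustration of guarding index() with the in operator: safe_index below is a hypothetical helper name, and returning None for a missing element is just one possible convention.

```python
def safe_index(items, target):
    """Return the index of the first occurrence of target, or None if absent."""
    if target in items:
        return items.index(target)
    return None  # assumption: None signals "not found"


letters = ['A', 'B', 'A', 'C', 'E']
print(safe_index(letters, 'C'))  # 3
print(safe_index(letters, 'Z'))  # None
```

This scans the list twice when the element is present (once for in, once for index()). Wrapping the index() call in try/except ValueError avoids the second scan.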
You want to pass in the optional second parameter to index, the location where you want index to start looking. After you find each match, reset this parameter to the location just after the match that was found.
def list_duplicates_of(seq, item):
    start_at = -1
    locs = []
    while True:
        try:
            loc = seq.index(item, start_at + 1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start_at = loc
    return locs

source = "ABABDBAAEDSBQEWBAFLSAFB"
print(list_duplicates_of(source, 'B'))
Prints:
[1, 3, 5, 11, 15, 22]
You can find all the duplicates at once in a single pass through source, by using a defaultdict to keep a list of all seen locations for any item, and returning those items that were seen more than once.
from collections import defaultdict

def list_duplicates(seq):
    tally = defaultdict(list)
    for i, item in enumerate(seq):
        tally[item].append(i)
    return ((key, locs) for key, locs in tally.items() if len(locs) > 1)

for dup in sorted(list_duplicates(source)):
    print(dup)
Prints:
('A', [0, 2, 6, 7, 16, 20])
('B', [1, 3, 5, 11, 15, 22])
('D', [4, 9])
('E', [8, 13])
('F', [17, 21])
('S', [10, 19])
If you want to do repeated testing for various keys against the same source, you can use functools.partial to create a new function variable, using a "partially complete" argument list, that is, specifying the seq, but omitting the item to search for:
from functools import partial

dups_in_source = partial(list_duplicates_of, source)

for c in "ABDEFS":
    print(c, dups_in_source(c))
Prints:
A [0, 2, 6, 7, 16, 20]
B [1, 3, 5, 11, 15, 22]
D [4, 9]
E [8, 13]
F [17, 21]
S [10, 19]
>>> def indices(lst, item):
...     return [i for i, x in enumerate(lst) if x == item]
...
>>> indices(List, "A")
[0, 2]
To get all duplicates, you can use the below method, but it is not very efficient. If efficiency is important you should consider Ignacio's solution instead.
>>> dict((x, indices(List, x)) for x in set(List) if List.count(x) > 1)
{'A': [0, 2]}
As for solving it using the index method of list instead, that method takes an optional second argument indicating where to start, so you could just repeatedly call it with the previous index plus 1.
>>> List.index("A")
0
>>> List.index("A", 1)
2