
Index of duplicate items in a Python list

Tags:

python

Does anyone know how I can get the index positions of duplicate items in a Python list? I have tried doing this, but it keeps giving me only the index of the first occurrence of the item in the list.

List = ['A', 'B', 'A', 'C', 'E'] 

I want it to give me:

index 0: A
index 2: A
user674864 asked Mar 24 '11 12:03



2 Answers

You want to pass in the optional second parameter to index, the location where you want index to start looking. After you find each match, reset this parameter to the location just after the match that was found.

def list_duplicates_of(seq, item):
    start_at = -1
    locs = []
    while True:
        try:
            loc = seq.index(item, start_at + 1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start_at = loc
    return locs

source = "ABABDBAAEDSBQEWBAFLSAFB"
print(list_duplicates_of(source, 'B'))

Prints:

[1, 3, 5, 11, 15, 22] 

You can find all the duplicates at once in a single pass through source, by using a defaultdict to keep a list of all seen locations for any item, and returning those items that were seen more than once.

from collections import defaultdict

def list_duplicates(seq):
    tally = defaultdict(list)
    for i, item in enumerate(seq):
        tally[item].append(i)
    return ((key, locs) for key, locs in tally.items()
            if len(locs) > 1)

for dup in sorted(list_duplicates(source)):
    print(dup)

Prints:

('A', [0, 2, 6, 7, 16, 20])
('B', [1, 3, 5, 11, 15, 22])
('D', [4, 9])
('E', [8, 13])
('F', [17, 21])
('S', [10, 19])
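Applied to the list from the question, the same single-pass tally can produce the output format that was asked for. The following is only a sketch: `format_duplicates` is a hypothetical helper name that does not appear in the original answer, and `list_duplicates` is repeated so the snippet runs on its own.

```python
from collections import defaultdict

def list_duplicates(seq):
    # Record every position of every item in one pass over seq.
    tally = defaultdict(list)
    for i, item in enumerate(seq):
        tally[item].append(i)
    # Keep only the items that were seen more than once.
    return ((key, locs) for key, locs in tally.items() if len(locs) > 1)

def format_duplicates(seq):
    # Hypothetical helper: one "index N: item" string per duplicate position.
    lines = []
    for item, locs in sorted(list_duplicates(seq)):
        for loc in locs:
            lines.append(f"index {loc}: {item}")
    return lines

List = ['A', 'B', 'A', 'C', 'E']
for line in format_duplicates(List):
    print(line)
```

For the question's list this prints `index 0: A` and `index 2: A`, each on its own line.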

If you want to do repeated testing for various keys against the same source, you can use functools.partial to create a new function variable, using a "partially complete" argument list, that is, specifying the seq, but omitting the item to search for:

from functools import partial
dups_in_source = partial(list_duplicates_of, source)

for c in "ABDEFS":
    print(c, dups_in_source(c))

Prints:

A [0, 2, 6, 7, 16, 20]
B [1, 3, 5, 11, 15, 22]
D [4, 9]
E [8, 13]
F [17, 21]
S [10, 19]
PaulMcG answered Sep 20 '22 11:09


>>> def indices(lst, item):
...   return [i for i, x in enumerate(lst) if x == item]
...
>>> indices(List, "A")
[0, 2]

To get all duplicates, you can use the method below, but it is not very efficient, because it rescans the whole list for every distinct item. If efficiency is important you should consider Ignacio's solution instead.

>>> dict((x, indices(List, x)) for x in set(List) if List.count(x) > 1)
{'A': [0, 2]}

As for solving it using the index method of list instead, that method takes a second optional argument indicating where to start, so you could just repeatedly call it with the previous index plus 1.

>>> List.index("A")
0
>>> List.index("A", 1)
2
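The repeated calls can also be wrapped in a small generator that yields each position lazily. This is a minimal sketch; `iter_indices` is a hypothetical name, not something defined in the answer.

```python
def iter_indices(seq, item):
    # Hypothetical helper: yield each position of item in seq, in order.
    start = 0
    while True:
        try:
            # index() searches from `start` and raises ValueError when
            # there is no further match.
            pos = seq.index(item, start)
        except ValueError:
            return
        yield pos
        # Resume just past the match that was found.
        start = pos + 1

List = ['A', 'B', 'A', 'C', 'E']
print(list(iter_indices(List, "A")))
```

Because it is a generator, a caller that only needs the first few positions can stop early without scanning the rest of the list.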
Lauritz V. Thaulow answered Sep 23 '22 11:09