find duplicates of items endings in a list

Question

I would like to find items in a list that have duplicate endings within the last 3 characters of the string

I know how to find duplicates using the code below, but need help with code how to find that the last strings of "sara" and "tamara" are the same so that one of the items can be copied to a duplicate_finding list

using the following code, I can only identifiy exact duplicate items of a list:

names = ["tom", "john", "sara" , "tamara" , "tom"]
single_finds = []
duplicate_finds = []

for i in names:
        if i in single_finds:
                duplicate_finds.append(i)
        else:
                single_finds.append(i)

print (single_finds)
print (duplicate_finds)

OUTPUT:

['tom', 'john', 'sara', 'tamara']
['tom']

yatu · Accepted Answer

One approach would be to use itertools.groupby, specifying that we want to group based on the last n characters using the key argument.

Then we can flatten the list removing those sublists with only 1 item using itertools.chain and take a set to remove duplicates (or a list if you want them):

from itertools import groupby, chain
k = lambda x: x[-3:]
l = [list(v) for _,v in groupby(sorted(names, key=k), key=k)]
# [['tamara', 'sara'], ['john'], ['tom', 'tom']]
[i[0] for i in l if len(i) > 1]
# ['tamara', 'tom']

wim · Answer

Accumulate names per-suffix using a dict, and then gather the results:

>>> from collections import defaultdict 
>>> d = defaultdict(list) 
>>> for name in names: 
...     suffix = name[-3:] 
...     d[suffix].append(name) 
... 
>>> for suffix, names in d.items(): 
...     print("-", suffix, ":", *names) 
... 
- tom : tom tom
- ohn : john
- ara : sara tamara

You can partition d.items() into singles and dupes by looking at the len(names) now.

This is an O(n) time-complexity solution, as opposed to groupby-based approaches which require pre-sorting the data at O(n log n).

find duplicates of items endings in a list

Tags:

python

list

sebastian20182018

2 Answers

yatu

wim

Recent Activity

Donate For Us

find duplicates of items endings in a list

Tags:

python

list

sebastian20182018

2 Answers

yatu

wim

Related questions

Recent Activity

Donate For Us