Removing elements that have consecutive partial duplicates in Python

Question

My question is similar to this one, but instead of removing full duplicates I'd like to remove consecutive partial "duplicates" from a list in Python.

For my particular use case, I want to remove words that start with the same character when they appear consecutively in the list, and I want to be able to define that character. In this example the character is #, so

['#python', 'is', '#great', 'for', 'handling', 
'text', '#python', '#text', '#nonsense', '#morenonsense', '.']

should become

['#python', 'is', '#great', 'for', 'handling', 'text', '.']

tobias_k · Accepted Answer

You could use itertools.groupby:

>>> from itertools import groupby
>>> lst = ['#python', 'is', '#great', 'for', 'handling', 'text', '#python', '#text', '#nonsense', '#morenonsense', '.']    
>>> [s for k, g in ((k, list(g)) for k, g in groupby(lst, key=lambda s: s.startswith("#")))
...    if not k or len(g) == 1 for s in g]
...
['#python', 'is', '#great', 'for', 'handling', 'text', '.']

This groups consecutive elements by whether they start with a #, then keeps only the elements from groups that do not start with # or that contain a single element.
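Since the question asks for the character to be configurable, the one-liner can also be unpacked into a small function. This is only a sketch of the same idea: the name drop_prefix_runs and its prefix parameter are made up for illustration and do not come from the answer above.

```python
from itertools import groupby

def drop_prefix_runs(words, prefix="#"):
    """Return words without any run of 2+ consecutive items starting with prefix.

    Hypothetical helper: the name and signature are illustrative only.
    """
    result = []
    # groupby yields (key, run) pairs, one per stretch of consecutive
    # items that share the same key (here: starts with prefix or not).
    for has_prefix, run in groupby(words, key=lambda s: s.startswith(prefix)):
        run = list(run)
        # Keep non-prefixed items, and prefixed items that stand alone.
        if not has_prefix or len(run) == 1:
            result.extend(run)
    return result

lst = ['#python', 'is', '#great', 'for', 'handling', 'text',
       '#python', '#text', '#nonsense', '#morenonsense', '.']
print(drop_prefix_runs(lst))
```

Passing a different prefix, for example drop_prefix_runs(words, "@"), applies the same rule to another character.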

jpp · Answer

Here's one solution using itertools.groupby. The idea is to group consecutive items depending on whether they start with a given character k. An item is dropped only if both of your criteria hold: it starts with k, and it sits in a run of more than one such item. Everything else is yielded.

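from itertools import groupby

L = ['#python', 'is', '#great', 'for', 'handling', 'text',
     '#python', '#text', '#nonsense', '#morenonsense', '.']

def list_filter(L, k):
    # Group consecutive items by whether they start with k.
    # startswith also copes with empty strings, where x[0] would raise IndexError.
    grouper = groupby(L, key=lambda x: x.startswith(k))
    for starts_with_k, group in grouper:
        items = list(group)
        # Drop only runs of two or more consecutive items starting with k.
        if not (starts_with_k and len(items) > 1):
            yield from items

res = list_filter(L, '#')

print(list(res))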

['#python', 'is', '#great', 'for', 'handling', 'text', '.']
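To see why this works, it may help to look at what the grouping step produces before any filtering. The snippet below is a small illustration, not part of the answer: it materialises the groups for the sample list, keyed on whether each item starts with '#'. The variable names are arbitrary.

```python
from itertools import groupby

L = ['#python', 'is', '#great', 'for', 'handling', 'text',
     '#python', '#text', '#nonsense', '#morenonsense', '.']

# Each entry is (starts_with_hash, items_in_that_consecutive_run).
groups = [(key, list(items))
          for key, items in groupby(L, key=lambda x: x.startswith('#'))]

for key, items in groups:
    print(key, items)
# True ['#python']
# False ['is']
# True ['#great']
# False ['for', 'handling', 'text']
# True ['#python', '#text', '#nonsense', '#morenonsense']
# False ['.']
```

Only the fifth group is both keyed True and longer than one item, so it is the only one that gets dropped.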

Tags: python, list, duplicates

Moritz
