I am very new to Python.
I want to change a sentence if it contains repeated words in a row.
Correct: Ex. "this just so so so nice" --> "this just so nice"
Right now I am using this regex, but it also changes letters. Ex. "My friend and i is happy" --> "My friend and is happy" (it removes the "i" and the space). ERROR
text = re.sub(r'(\w+)\1', r'\1', text) #remove duplicated words in row
How can I make the same change, but have it check whole words instead of letters?
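You can remove duplicates using a Python set or the dict.fromkeys() method. dict.fromkeys() builds a dictionary whose keys are the items of the list, so each item is kept only once. Note that this removes every repeated word in the sentence, not only words repeated in a row.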
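A minimal sketch of the dict.fromkeys() approach, for comparison. The function name dedupe_all_words is made up for this illustration, and it relies on dictionaries preserving insertion order (Python 3.7+). It drops every later occurrence of a word, so it is probably not what the question wants:

```python
def dedupe_all_words(text):
    # dict.fromkeys keeps only the first occurrence of each word,
    # in the order the words first appear (Python 3.7+).
    return " ".join(dict.fromkeys(text.split()))

print(dedupe_all_words("this is just is is"))  # this is just
```

Here the second "is" after "just" is lost as well, even though it does not directly follow another "is".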
The (?: ... ) syntax marks a subpattern as non-capturing. That means whatever is matched by (?:\w+\s) is not captured, even though it is enclosed in parentheses, so it won't appear in the list of matches; only (\w+) will.
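To see the difference, here is a small sketch with re.findall; the sample string is made up for illustration:

```python
import re

sample = "hello world foo bar"  # made-up sample text

# Non-capturing first group: only the (\w+) group is returned.
non_capturing = re.findall(r'(?:\w+\s)(\w+)', sample)

# Same pattern with a capturing first group: both groups are returned.
capturing = re.findall(r'(\w+\s)(\w+)', sample)

print(non_capturing)  # ['world', 'bar']
print(capturing)      # [('hello ', 'world'), ('foo ', 'bar')]
```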
To replace text in a string in Python, use re.sub() from the built-in re module (don't forget to import re). It searches the string for the pattern and returns a new string in which each match is replaced by the given replacement.
text = re.sub(r'\b(\w+)( \1\b)+', r'\1', text)  # collapse words repeated in a row into one
The \b matches the empty string, but only at the beginning or end of a word.
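A short sketch comparing the two patterns. The function name collapse_repeated_words is made up here, and the pattern assumes repeated words are separated by exactly one space:

```python
import re

def collapse_repeated_words(text):
    # \b(\w+) captures a whole word; ( \1\b)+ matches one or more
    # repetitions of that same word, each preceded by a single space.
    return re.sub(r'\b(\w+)( \1\b)+', r'\1', text)

print(collapse_repeated_words("this just so so so nice"))   # this just so nice
print(collapse_repeated_words("this is just is is"))        # this is just is
print(collapse_repeated_words("My friend and i is happy"))  # My friend and i is happy
print(collapse_repeated_words("the the theory"))            # the theory

# The pattern from the question has no word boundaries, so it also
# collapses repeated letters inside a word.
original = re.sub(r'(\w+)\1', r'\1', "My friend and i is happy")
print(original)  # My friend and i is hapy
```

The trailing \b is what stops "the the theory" from being cut inside "theory": the third " the" is followed by "o", so there is no word boundary and that repetition is not matched.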
Non-regex solution using itertools.groupby:
>>> strs = "this is just is is"
>>> from itertools import groupby
>>> " ".join([k for k,v in groupby(strs.split())])
'this is just is'
>>> strs = "this just so so so nice"
>>> " ".join([k for k,v in groupby(strs.split())])
'this just so nice'