
Python regex, how to delete all matches from a string


I have a list of regex patterns.

rgx_list = ['pattern_1', 'pattern_2', 'pattern_3']

I am using a function that loops through the list, compiles each regex, and applies findall to grab the matched terms. I would then like a way to delete those terms from the text.

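import re

def clean_text(rgx_list, text):
    matches = []
    for r in rgx_list:
        rgx = re.compile(r)
        # findall returns a list of every match of this pattern
        found_matches = rgx.findall(text)
        matches.append(found_matches)
    # missing step: remove the matches from text and return the result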

I want to do something like text.delete(matches) so that all of the matches will be deleted from the text and then I can return the cleansed text.

Does anyone know how to do this? My current code will only work for one match of each pattern, but the text may have more than one occurrence of the same pattern, and I would like to eliminate all matches.

eggman asked May 12 '16 16:05




1 Answer

Use re.sub to replace every match with an empty string. By default it replaces all non-overlapping occurrences of the pattern, not just the first, so there is no need to find the matches separately.

import re

def clean_text(rgx_list, text):
    new_text = text
    for rgx_match in rgx_list:
        # re.sub removes every occurrence of this pattern
        new_text = re.sub(rgx_match, '', new_text)
    return new_text
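As a quick check, here is a minimal sketch of the same idea run against made-up input. The sample patterns, the sample text, and the clean_text_single_pass helper are hypothetical and only for illustration. The single-pass variant joins the patterns into one alternation; it assumes the patterns use no numbered backreferences or global inline flags, and it may give different results from the loop when removing one match creates a new match for another pattern.

```python
import re

def clean_text(rgx_list, text):
    """Remove every match of every pattern in rgx_list, one pattern at a time."""
    for pattern in rgx_list:
        # re.sub replaces all non-overlapping matches, not only the first
        text = re.sub(pattern, '', text)
    return text

def clean_text_single_pass(rgx_list, text):
    """Hypothetical variant: combine the patterns and remove matches in one pass."""
    # Wrap each pattern in a non-capturing group so the alternation stays unambiguous
    combined = re.compile('|'.join('(?:{})'.format(p) for p in rgx_list))
    return combined.sub('', text)

# Made-up patterns and text, chosen only to show repeated matches being removed
sample_patterns = [r'\d+', r'foo']
sample_text = 'foo 12 bar foo 345 baz'

cleaned = clean_text(sample_patterns, sample_text)
cleaned_single = clean_text_single_pass(sample_patterns, sample_text)

print(cleaned.split())
print(cleaned_single.split())
```

Note that the surrounding whitespace is left behind once the matches are gone. If that matters, the patterns can be written to include it, or the result can be normalized afterwards with something like ' '.join(cleaned.split()).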
Matt S answered Jan 08 '23 12:01
