
How to replace multiple matches / groups with regexes?

Normally we would write the following to replace one match:

import re
namesRegex = re.compile(r'(is)|(life)', re.I)
replaced = namesRegex.sub(r"butter", "There is no life in the void.")
print(replaced)

output:
There butter no butter in the void.

What I want is to replace each group with its own text, probably using back references. Namely, I want to replace the first group (is) with "are" and the second group (life) with "butterflies".

Maybe something like the following, although this is not working code:

namesRegex = re.compile(r'(is)|(life)', re.I)
replaced = namesRegex.sub(r"(are) (butterflies)", r"\1 \2", "There is no life in the void.")
print(replaced)

Is there a way to replace multiple groups in one statement in Python?

KeyC0de asked Jul 17 '17 10:07


3 Answers

You can use a replacement by lambda, mapping the keywords you want to associate:

>>> re.sub(r'(is)|(life)', lambda x: {'is': 'are', 'life': 'butterflies'}[x.group(0)], "There is no life in the void.")
'There are no butterflies in the void.'
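
The question compiles its pattern with re.I, and in that case a lookup keyed on the matched text would raise KeyError for input such as "IS". A minimal sketch of one way around this, assuming each alternative sits in its own capture group, is to pick the replacement by group number via the match object's lastindex attribute. The names REPLACEMENTS and replace_by_group are hypothetical and not part of the original answer.

```python
import re

# Replacement text keyed by capture-group number (hypothetical table):
# group 1 is (is), group 2 is (life).
REPLACEMENTS = {1: "are", 2: "butterflies"}

pattern = re.compile(r"(is)|(life)", re.I)

def replace_by_group(match):
    # lastindex is the index of the capture group that took part in the
    # match, so the lookup does not depend on the case of the matched text.
    return REPLACEMENTS[match.lastindex]

result = pattern.sub(replace_by_group, "There IS no Life in the void.")
print(result)  # There are no butterflies in the void.
```

This only works while every alternative is a top-level capture group; nested groups would change what lastindex reports.
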
Uriel answered Oct 09 '22 17:10


You can define a map of keys and replacements first and then use a lambda function in replacement:

>>> repl = {'is': 'are', 'life': 'butterflies'}
>>> print(re.sub(r'is|life', lambda m: repl[m.group()], "There is no life in the void."))
There are no butterflies in the void.

I would also suggest putting word boundaries around your keys so that they match only whole words (otherwise "is" would also match inside a word such as "this"):

>>> print(re.sub(r'\b(?:is|life)\b', lambda m: repl[m.group()], "There is no life in the void."))
There are no butterflies in the void.
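
When the map grows, the alternation can be generated from the dictionary keys instead of being typed by hand. The following is a sketch only: the helper name replace_words is hypothetical, and it assumes the keys are plain words, so that \b behaves as expected at both ends. Keys are passed through re.escape in case they contain regex metacharacters, and sorted longest first so that a longer key is tried before any shorter key it starts with.

```python
import re

def replace_words(text, repl):
    """Replace every whole-word occurrence of a key in repl with its value."""
    # Longest keys first, so a longer key is tried before any shorter key
    # that is a prefix of it.
    keys = sorted(repl, key=len, reverse=True)
    # Escape each key and wrap the alternation in word boundaries.
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: repl[m.group()], text)

repl = {"is": "are", "life": "butterflies"}
result = replace_words("This is no life in the void.", repl)
print(result)  # This are no butterflies in the void.
```

Note that the "is" inside "This" is left alone because of the word boundaries.
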
anubhava answered Oct 09 '22 18:10


You may use a dictionary of search-replacement pairs and a simple \w+ regex to match words:

import re
dt = {'is' : 'are', 'life' : 'butterflies'}
namesRegex = re.compile(r'\w+')
replaced = namesRegex.sub(lambda m: dt[m.group()] if m.group() in dt else m.group(), "There is no life in the void.")
print(replaced)


With this approach, you do not have to worry about building an overly large regex pattern out of alternations. You may adjust the pattern to include word boundaries, or to match only letters (e.g. [^\W\d_]+), etc., as per the requirements. The main point is that the pattern must be able to match every search term that is a key in the dictionary.

The if m.group() in dt else m.group() part checks whether the found match is a key in the dictionary. If it is, the value from the dictionary is returned; if it is not, the match is returned unchanged.
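
To illustrate how the choice of pattern affects which tokens reach the dictionary, here is a small sketch comparing \w+ with a letters-only class. The sample sentence and the helper name lookup are assumptions made for illustration; dt.get(key, default) is used as a shorter equivalent of the conditional expression above.

```python
import re

dt = {"is": "are", "life": "butterflies"}

def lookup(m):
    # Return the mapped value, or the matched text itself when it is not a key.
    return dt.get(m.group(), m.group())

text = "There is no life2 in the void."

# \w+ treats "life2" as one token, which is not a key, so it is kept as is.
with_word_pattern = re.sub(r"\w+", lookup, text)

# [^\W\d_]+ matches runs of letters only, so "life" is matched apart from "2".
with_letters_pattern = re.sub(r"[^\W\d_]+", lookup, text)

print(with_word_pattern)     # There are no life2 in the void.
print(with_letters_pattern)  # There are no butterflies2 in the void.
```
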

Wiktor Stribiżew answered Oct 09 '22 19:10