Consider:
import random
import re
text = "abcdef"
pattern = "(b|e)cd(b|e)"
repl = [r"\1bla\2", r"\1blabla\2"]
text = re.sub(pattern, lambda m: random.choice(repl), text)
I want to replace each match with a randomly chosen entry from the list repl. But when I use lambda m: random.choice(repl) as the callback, \1, \2 etc. are no longer replaced with their captures, and "\1bla\2" is inserted as plain text.
I looked through re.py to see how this is done internally, hoping I could call the same internal function, but it doesn't seem trivial.
The example above returns a\1bla\2f or a\1blabla\2f, whereas the results I want are abblaef or abblablaef.
Note that I'm using a function, because, in case of several matches like text = "abcdef abcdef", it should randomly choose a replacement from repl for every match – instead of using the same replacement for all matches.
If you pass a function, you lose the automatic expansion of backreferences: whatever the function returns is inserted as-is. You just get the match object and have to do the work yourself. So you could:
Pick a replacement string up front and pass that to re.sub instead of a function (note that the same replacement is then used for every match in that call):
text = "abcdef"
pattern = "(b|e)cd(b|e)"
repl = [r"\1bla\2", r"\1blabla\2"]
re.sub(pattern, random.choice(repl), text)
# 'abblaef' or 'abblablaef'
Or write a function that processes the match object, which allows more complex processing. You can use the match object's expand method to apply backreferences:
text = "abcdef abcdef"
pattern = "(b|e)cd(b|e)"
def repl(m):
    # Choose a template per match, then let expand() fill in \1 and \2
    templates = [r"\1bla\2", r"\1blabla\2"]
    return m.expand(random.choice(templates))
re.sub(pattern, repl, text)
# 'abblaef abblablaef' and variations
You can, of course, put that logic into a lambda:
repl = [r"\1bla\2", r"\1blabla\2"]
re.sub(pattern, lambda m: m.expand(random.choice(repl)), text)
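To see both behaviours side by side, here is a minimal self-contained sketch. The helper name random_sub and its rng parameter are made up for illustration and are not part of the re module. The seed is there only so repeated runs pick the same templates.

```python
import random
import re


def random_sub(pattern, templates, text, rng=random):
    """Replace each match with a randomly chosen, backreference-expanded template.

    Hypothetical helper: `rng` is anything with a .choice() method
    (the random module itself, or a random.Random instance).
    """
    def pick(m):
        # m.expand() does the \1, \2, \g<name> substitution that re.sub
        # would otherwise apply to a string replacement.
        return m.expand(rng.choice(templates))

    return re.sub(pattern, pick, text)


templates = [r"\1bla\2", r"\1blabla\2"]
rng = random.Random(0)  # seeded only for reproducibility
result = random_sub(r"(b|e)cd(b|e)", templates, "abcdef abcdef", rng)
print(result)  # each word is 'abblaef' or 'abblablaef', chosen per match

# Without expand(), the callable's return value is inserted literally:
literal = re.sub(r"(b|e)cd(b|e)", lambda m: r"\1bla\2", "abcdef")
print(literal)  # a\1bla\2f
```

Because the choice happens inside the callback, it is made once per match instead of once per re.sub call.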