I have a Python string from which I need to remove parenthesized text. The standard way is to use text = re.sub(r'\([^)]*\)', '', text), which removes the content within the parentheses.
However, I just found a string that looks like (Data with in (Boo) And good luck). With the regex I use, the And good luck) part is still left behind. I know I can scan through the entire string, keep a counter of the number of ( and ), and, when the counts are balanced, record the positions of the ( and ) and remove the content in between, but is there a better/cleaner way of doing that? It doesn't need to be a regex; whatever works is great. Thanks.
Someone asked for the expected result, so here's what I am expecting:
Hi this is a test ( a b ( c d) e) sentence
After the replacement I want it to be Hi this is a test sentence, instead of Hi this is a test e) sentence
With the re module (replace the innermost parentheses until there are no more replacements to make):
import re
s = r'Sainte Anne -(Data with in (Boo) And good luck) Charenton'
nb_rep = 1
while nb_rep:
    s, nb_rep = re.subn(r'\([^()]*\)', '', s)
print(s)
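As a rough sketch, the same loop can be wrapped in a helper and applied to the example from the question. The name strip_nested_parens and the whitespace clean-up step are assumptions added for illustration, not part of the answer above. Removing a parenthesized group leaves a double space behind, so the sketch collapses runs of spaces to match the expected result.

```python
import re

def strip_nested_parens(text):
    """Hypothetical helper: repeatedly remove innermost (...) groups."""
    count = 1
    while count:
        # Each pass removes only groups that contain no further parentheses.
        text, count = re.subn(r'\([^()]*\)', '', text)
    # Assumption: collapse the spaces left behind by the removed groups.
    return re.sub(r' {2,}', ' ', text).strip()

print(strip_nested_parens('Hi this is a test ( a b ( c d) e) sentence'))
# Hi this is a test sentence
```

Unbalanced parentheses are simply left in place by this approach, since the pattern only matches complete pairs.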
With the regex module that allows recursion:
import regex
s = r'Sainte Anne -(Data with in (Boo) And good luck) Charenton'
print(regex.sub(r'\([^()]*+(?:(?R)[^()]*)*+\)', '', s))
Where (?R) refers to the whole pattern itself.
First I split the line into tokens that do not contain parentheses, then join them into a new line. Note that this removes only the parenthesis characters and keeps the text inside them:
import re
line = "(Data with in (Boo) And good luck)"
new_line = "".join(re.split(r'[()]', line))
print(new_line)
# 'Data with in Boo And good luck'
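For comparison, the counter-based scan mentioned in the question can be sketched without any regex. This is only one possible reading of that idea: the function name remove_parenthesized is hypothetical, and keeping an unmatched ) in the output is an assumption, since the question does not say how unbalanced input should be handled.

```python
def remove_parenthesized(text):
    """Hypothetical helper: drop everything inside balanced parentheses."""
    depth = 0   # how many '(' are currently open
    kept = []   # characters that lie outside all parentheses
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            if depth:
                depth -= 1
            else:
                kept.append(ch)  # assumption: keep an unmatched ')'
        elif depth == 0:
            kept.append(ch)
    return ''.join(kept)

print(remove_parenthesized('Hi this is a test ( a b ( c d) e) sentence'))
```

The scan makes a single pass over the string. The output keeps the spaces on both sides of the removed group, so the result contains a double space, and an opening ( that is never closed causes the rest of the string to be dropped.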