
How to remove all text between the outer parentheses in a string?

When I have a string like this:

s1 = 'stuff(remove_me)'

I can easily remove the parentheses and the text within using

# returns 'stuff'
res1 = re.sub(r'\([^)]*\)', '', s1)

as explained here.

But I sometimes encounter nested expressions like this:

s2 = 'stuff(remove(me))'

When I run the command from above, I end up with

'stuff)'

I also tried:

re.sub(r'\(.*?\)', '', s2)

which gives me the same output.

How can I remove everything within the outer parentheses - including the parentheses themselves - so that I also end up with 'stuff' (which should work for arbitrarily complex expressions)?

Cleb asked May 30 '16 14:05




1 Answer

NOTE: \(.*\) matches the first ( from the left, then any 0+ characters (other than a newline, unless the DOTALL modifier is enabled) up to the last ), so it does not account for properly nested parentheses.
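To make the difference between the patterns concrete, here is a small sketch that runs the two patterns from the question and the greedy pattern from the note on the same inputs. The variable names and the second sample string 'a(b)c(d)e' are made up for illustration.

```python
import re

nested = 'stuff(remove(me))'   # the string from the question
flat = 'a(b)c(d)e'             # hypothetical input with two separate groups

# Negated class: stops at the first ')', so the outer ')' is left behind.
negated_nested = re.sub(r'\([^)]*\)', '', nested)

# Lazy dot: also stops at the first ')', same result.
lazy_nested = re.sub(r'\(.*?\)', '', nested)

# Greedy dot: runs from the first '(' to the last ')'.
greedy_nested = re.sub(r'\(.*\)', '', nested)
greedy_flat = re.sub(r'\(.*\)', '', flat)

print(negated_nested)  # stuff)
print(lazy_nested)     # stuff)
print(greedy_nested)   # stuff
print(greedy_flat)     # ae
```

The greedy pattern happens to work on the nested string, but on 'a(b)c(d)e' it also swallows the 'c' between the two groups, which is why it is not a general solution.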

To remove nested parentheses correctly with a regular expression in Python, you may use a simple \([^()]*\) (matching a (, then 0+ chars other than ( and ), and then a )) in a while loop using re.subn:

import re

def remove_text_between_parens(text):
    n = 1  # run at least once
    while n:
        # remove non-nested/flat balanced parts; n is the number of replacements made
        text, n = re.subn(r'\([^()]*\)', '', text)
    return text

Basically: remove each (...) that has no ( or ) inside, and repeat until no match is found. Usage:

print(remove_text_between_parens('stuff (inside (nested) brackets) (and (some(are)) here) here'))
# => stuff   here
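To see how the loop peels off one nesting level per pass, the following sketch records the string after every pass. The helper name trace_paren_removal is hypothetical and not part of the original answer; it uses the same pattern and re.subn call.

```python
import re

def trace_paren_removal(text):
    """Return the input followed by the text after each pass that changed it."""
    steps = [text]
    while True:
        # Same pattern as above: a '(' ... ')' pair with no parentheses inside.
        text, n = re.subn(r'\([^()]*\)', '', text)
        if n == 0:  # nothing left to remove
            break
        steps.append(text)
    return steps

steps = trace_paren_removal('stuff(remove(me))')
print(steps)
# ['stuff(remove(me))', 'stuff(remove)', 'stuff']
```

Each pass removes only the innermost groups, so the number of passes grows with the nesting depth rather than with the number of groups.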

A non-regex way is also possible:

def removeNestedParentheses(s):
    ret = ''
    skip = 0  # current nesting depth
    for i in s:
        if i == '(':
            skip += 1
        elif i == ')' and skip > 0:
            skip -= 1
        elif skip == 0:
            # keep characters that are outside any parentheses
            ret += i
    return ret

x = removeNestedParentheses('stuff (inside (nested) brackets) (and (some(are)) here) here')
print(x)
# => stuff   here
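The two approaches agree on balanced input but can differ when a parenthesis is never closed. The sketch below restates both under the hypothetical names regex_version and counter_version so it runs on its own; the sample strings are made up for illustration.

```python
import re

def regex_version(text):
    # Repeatedly remove innermost balanced groups.
    n = 1
    while n:
        text, n = re.subn(r'\([^()]*\)', '', text)
    return text

def counter_version(s):
    # Track nesting depth and keep only characters at depth 0.
    kept = []
    depth = 0
    for ch in s:
        if ch == '(':
            depth += 1
        elif ch == ')' and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return ''.join(kept)

unclosed = 'stuff(open(closed)'
stray = 'a)b(c)'

print(regex_version(unclosed))    # stuff(open
print(counter_version(unclosed))  # stuff
print(regex_version(stray))       # a)b
print(counter_version(stray))     # a)b
```

With an unclosed (, the regex loop leaves the unmatched part in place, while the counter drops everything after it. Which behaviour is preferable depends on how malformed input should be treated.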


Wiktor Stribiżew answered Sep 25 '22 12:09