
Errors when trying to remove parentheses in python text

I've been working on a bit of code to take a bunch of histograms from other files and plot them together. In order to make sure the legend displays correctly I've been trying to take the titles of these original histograms and cut out a bit of information that isn't needed any more.

The section I don't need takes the form (A mass = 200 GeV). I've had no problem removing what's inside the parentheses; unfortunately, everything I've tried for the parentheses themselves either has no effect, negates the code that removes the text, or throws errors.

I've tried using suggestions from "Remove parenthesis and text in a file using Python" and "How can I remove text within parentheses with a regex?"

The error my current attempt gives me is

'str' object cannot be interpreted as an integer

This is the section of the code:

histo_name = ''

# this is a list of things we do not want to show up in our legend keys
REMOVE_LIST = ["(A mass = 200 GeV)"]

# these two lines use the re module to remove things from a piece of text
# that are specified in the remove list
remove = '|'.join(REMOVE_LIST)
regex = re.compile(r'\b('+remove+r')\b')

# Creating the correct name for the stacked histogram
for histo in histos:

    if histo == histos[0]:

        # place_holder contains the edited string we want to set the
        # histogram title to
        place_holder = regex.sub('', str(histo.GetName()))
        histo_name += str(place_holder)
        histo.SetTitle(histo_name)

    else:

        place_holder = regex.sub(r'\(\w*\)', '', str(histo.GetName()))
        histo_name += ' + ' + str(place_holder)
        histo.SetTitle(histo_name)

The if/else bit is just because the first histogram I pass in isn't getting stacked, so I just want it to keep its own name, while the rest are stacked in order (hence the '+' etc.), but I thought I'd include it.

Apologies if I've done something really obvious wrong, I'm still learning!

Asked by Ciara on Feb 21 '26 at 22:02


1 Answer

From the Python docs: "To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(], [)]."

So use one of the above patterns instead of the plain parentheses in your regex, e.g. REMOVE_LIST = [r"\(A mass = 200 GeV\)"] (a raw string keeps Python from treating the backslashes as string escapes).
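If the entries in the list are meant to be taken literally, another option is to let `re.escape` add the backslashes rather than writing them by hand. This is only a minimal sketch: `REMOVE_LIST` is taken from the question, while `sample_title` is a made-up title used purely for illustration.

```python
import re

# Literal strings to strip from the legend titles (from the question).
REMOVE_LIST = ["(A mass = 200 GeV)"]

# re.escape backslash-escapes regex metacharacters such as '(' and ')',
# so each entry is matched as plain text.
pattern = "|".join(re.escape(item) for item in REMOVE_LIST)
regex = re.compile(pattern)

# Hypothetical histogram title, not taken from the original post.
sample_title = "ttbar (A mass = 200 GeV)"

# Remove the unwanted text, then trim the space left behind.
cleaned = regex.sub("", sample_title).strip()
print(cleaned)
```

This keeps the remove list readable, since the entries stay exactly as they appear in the titles.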

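EDIT: The issue seems to be with your use of \b in the regex. According to the docs linked above, \b only matches at the boundary between a word character (\w) and a non-word character. '(' and ')' are non-word characters, so a \b placed before '(' or after ')' fails to match whenever the parenthesis is next to a space or the start or end of the string. My seemingly-working example, with \b dropped, is: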

import re

# Test input
myTestString = "someMess (A mass = 200 GeV) and other mess (remove me if you can)"
replaceWith = "HEY THERE FRIEND"

# What to remove
removeList = [r"\(A mass = 200 GeV\)", r"\(remove me if you can\)"]

# Build the regex
remove = r'(' + '|'.join(removeList) + r')'
regex = re.compile(remove)

# Try it!
out = regex.sub(replaceWith, myTestString)

# See if it worked
print(out)
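The same idea could be applied to the loop from the question. One thing worth noting: a compiled pattern's `sub` method takes `(repl, string, count=0)`, so in `regex.sub(r'\(\w*\)', '', str(histo.GetName()))` the histogram name ends up in the `count` position, which is the likely source of the `'str' object cannot be interpreted as an integer` error. The sketch below assumes a stand-in `FakeHisto` class (hypothetical, only `GetName` and `SetTitle` come from the question) and made-up histogram names.

```python
import re


class FakeHisto:
    """Hypothetical stand-in for the histogram objects in the question."""

    def __init__(self, name):
        self._name = name
        self.title = name

    def GetName(self):
        return self._name

    def SetTitle(self, title):
        self.title = title


# Escaped parentheses, no \b; \s* also swallows the space before '('.
regex = re.compile(r"\s*\(A mass = 200 GeV\)")

# Made-up names, used only for illustration.
histos = [
    FakeHisto("signal (A mass = 200 GeV)"),
    FakeHisto("background (A mass = 200 GeV)"),
]

# Passing three positional arguments to a compiled pattern's sub()
# puts the text in the count slot and raises TypeError.
raised_type_error = False
try:
    regex.sub(r"\(\w*\)", "", "signal (A mass = 200 GeV)")
except TypeError:
    raised_type_error = True

histo_name = ""
for index, histo in enumerate(histos):
    # Compiled patterns take sub(repl, string): replacement, then text.
    place_holder = regex.sub("", str(histo.GetName()))
    if index == 0:
        histo_name += place_holder
    else:
        histo_name += " + " + place_holder
    histo.SetTitle(histo_name)

titles = [histo.title for histo in histos]
print(titles)
```

Using the same two-argument `regex.sub` call in both branches means the first and later histograms are cleaned the same way, and only the `' + '` joining differs.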
Answered by James Elderfield on Feb 23 '26 at 11:02


