How to remove parentheses and all data within using Pandas/Python?

Tags: python, regex, pandas, removeall

I have a dataframe in which I want to remove every pair of parentheses together with the text inside them.

I looked at: How can I remove text within parentheses with a regex?

The answer given there for removing the text was

re.sub(r'\([^)]*\)', '', filename)

I tried this as well as

re.sub(r'\(.*?\)', '', filename)

However, I got an error: expected a string or buffer

When I tried using the column df['Column Name'], I got: no item named 'Column Name'

I checked the dataframe using df.head() and it showed up as a clean table with the column names I expected. However, when I use the re expression to remove the (stuff), it does not recognize the column name that I have.

I normally use

df['name'].str.replace(" ()","")

However, I want to remove the parentheses and everything inside them. How can I do this using either regex or pandas?

Thanks!

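Here is the solution I used (thanks for the help!). Note that on pandas 2.0 and later str.replace treats the pattern as a literal string by default, so regex=True has to be passed:

All['Manufacturer Standard Name'] = All['Manufacturer Standard Name'].str.replace(r"\(.*\)", "", regex=True)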

998

asked Jan 03 '14 00:01

Alexis

2 Answers

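df['name'].str.replace(r"\(.*\)", "", regex=True)

You can't run re functions directly on pandas objects: re.sub expects a single string, so it has to be applied to each element. Series.str.replace(r"\(.*\)", "", regex=True) does that for you and is roughly equivalent to Series.apply(lambda x: re.sub(r"\(.*\)", "", x)), except that it leaves missing values alone instead of raising an error. On pandas 2.0 and later, regex=True is required because the default is literal matching.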
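To make the comparison concrete, here is a minimal sketch of the three approaches side by side. The dataframe and its values are made up for illustration, and regex=True assumes pandas 0.23 or later (where that parameter exists).

```python
import re
import pandas as pd

# Hypothetical sample data; the column name and values are illustrative only.
df = pd.DataFrame({"name": ["Acme (USA)", "Globex", None]})

# Passing a whole Series to re.sub fails, because re.sub expects one string.
try:
    re.sub(r"\(.*\)", "", df["name"])
    resub_failed = False
except TypeError as exc:
    resub_failed = True
    print("re.sub on a Series fails:", exc)

# Vectorized form: applies the pattern to every element, skipping missing values.
vectorized = df["name"].str.replace(r"\(.*\)", "", regex=True)

# Element-wise form: missing values have to be skipped by hand.
elementwise = df["name"].apply(
    lambda x: re.sub(r"\(.*\)", "", x) if isinstance(x, str) else x
)

print(vectorized.tolist())
print(elementwise.tolist())
```

Both forms leave a trailing space in the first value, since only the parenthesized part is removed.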

198

answered Sep 26 '22 00:09

dmvianna

If you have multiple (...) substrings in the data, the greedy \(.*\) will remove everything from the first ( to the last ) on a line, so you should consider using one of the following instead:

All['Manufacturer Standard Name'] = All['Manufacturer Standard Name'].str.replace(r"\(.*?\)", "", regex=True)

All['Manufacturer Standard Name'] = All['Manufacturer Standard Name'].str.replace(r"\([^()]*\)", "", regex=True)

The difference is that .*? is typically slower and, because . does not match line breaks by default, it misses parenthesized text that spans several lines, whereas [^()] matches any character except ( and ), including line breaks, and is quite efficient. The first pattern will match (...(...) but the second will only match (...).
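The behaviour of these patterns can be checked with plain re.sub on a few strings. This is only a sketch; the sample strings are invented for illustration.

```python
import re

GREEDY = r"\(.*\)"
LAZY = r"\(.*?\)"
NEGATED = r"\([^()]*\)"

# Two separate groups on one line: the greedy pattern swallows the text between them.
single = "Acme (US) Widgets (old) Ltd"
greedy_single = re.sub(GREEDY, "", single)    # "Acme  Ltd"
lazy_single = re.sub(LAZY, "", single)        # "Acme  Widgets  Ltd"
negated_single = re.sub(NEGATED, "", single)  # "Acme  Widgets  Ltd"

# A group that spans a line break: "." stops at "\n", "[^()]" does not.
multiline = "Acme (old\nmodel) Ltd"
lazy_multiline = re.sub(LAZY, "", multiline)        # unchanged
negated_multiline = re.sub(NEGATED, "", multiline)  # "Acme  Ltd"

# An unbalanced opening parenthesis before a complete group.
unbalanced = "a (b (c) d"
lazy_unbalanced = re.sub(LAZY, "", unbalanced)        # removes "(b (c)"
negated_unbalanced = re.sub(NEGATED, "", unbalanced)  # removes only "(c)"

print(repr(greedy_single), repr(lazy_single), repr(negated_single))
print(repr(lazy_multiline), repr(negated_multiline))
print(repr(lazy_unbalanced), repr(negated_unbalanced))
```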

If you also want to clean up the whitespace left behind after removing these substrings, you may consider

All['Manufacturer Standard Name'] = All['Manufacturer Standard Name'].str.replace(r"\s*\([^()]*\)", "", regex=True).str.strip()

The \s*\([^()]*\) regex matches zero or more whitespace characters followed by the parenthesized substring, and str.strip() then removes any leading or trailing whitespace that remains.
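As a quick illustration of that chain, here is a small sketch on a made-up Series (the values are hypothetical, and regex=True again assumes pandas 0.23 or later):

```python
import pandas as pd

# Hypothetical values: groups in the middle, at the end, and at the start.
s = pd.Series(["Acme (US) Widgets (old)", "(draft) Globex", "Initech"])

# Drop each "(...)" together with the whitespace before it,
# then trim whatever whitespace is left at either end.
cleaned = s.str.replace(r"\s*\([^()]*\)", "", regex=True).str.strip()

print(cleaned.tolist())
```

The str.strip() call matters for the second value, where the removed group sits at the start and leaves a leading space behind.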

answered Sep 25 '22 00:09

Wiktor Stribiżew
