
How to remove special characters from rows in pandas dataframe

I have a column in a pandas DataFrame like the one shown below:

LGA

Alpine (S)
Ararat (RC)
Ballarat (C)
Banyule (C)
Bass Coast (S)
Baw Baw (S)
Bayside (C)
Benalla (RC)
Boroondara (C)

What I want to do is remove the special characters at the end of each row, i.e. the bracketed suffixes such as (S) and (RC).

The desired output is:

LGA

Alpine
Ararat
Ballarat
Banyule
Bass Coast
Baw Baw
Bayside
Benalla
Boroondara

I am not sure how to get the desired output shown above.

Any help would be appreciated.

Thanks

adey27 asked Dec 16 '21 05:12


2 Answers

Here is a different approach using a regular expression. It deletes anything between brackets (round or square) and then trims the leftover whitespace:

import re
import pandas as pd

df = pd.DataFrame({'LGA': ['Alpine (S)', 'Ararat (RC)', 'Bass Coast (S)']})
# Delete anything between round or square brackets, then strip the trailing space.
# The raw string (r"...") keeps the backslashes in the pattern intact.
df['LGA'] = [re.sub(r"[\(\[].*?[\)\]]", "", x).strip() for x in df['LGA']]
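
The pattern above removes every bracketed group, wherever it appears in the string. If only the suffix at the end of each row should go, a narrower variant could anchor the pattern with `$` and use pandas' vectorised `str.replace`. This is a minimal sketch, not part of the original answer; the row 'Foo (Bar) Town (C)' and the column name `LGA_clean` are made up for illustration.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical row with an inner bracket
df = pd.DataFrame({'LGA': ['Alpine (S)', 'Ararat (RC)', 'Bass Coast (S)',
                           'Foo (Bar) Town (C)']})

# \s*        optional whitespace before the suffix
# \([^)]*\)  one group in round brackets
# $          matched only at the very end of the string
df['LGA_clean'] = df['LGA'].str.replace(r'\s*\([^)]*\)$', '', regex=True)

print(df['LGA_clean'].tolist())
# ['Alpine', 'Ararat', 'Bass Coast', 'Foo (Bar) Town']
```

Because the leading whitespace is part of the match, no separate `strip()` call is needed here.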

Gedas Miksenas answered Sep 27 '22 23:09


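import pandas as pd

df = pd.DataFrame({'LGA': ['Alpine (S)', 'Ararat (RC)', 'Bass Coast (S)']})
# Split once on the first '(' and keep only the part before it.
# strip() removes the space left in front of the bracket, and no
# throwaway column is created.
df['LGA'] = df['LGA'].str.split('(', n=1).str[0].str.strip()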
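
Splitting on the first '(' also discards any text that follows an earlier bracket. Assuming the suffix to drop is always the last bracketed group, one possible variation is to split from the right with `str.rsplit`. This is only a sketch; the last two rows and the column name `LGA_clean` are hypothetical and not from the question.

```python
import pandas as pd

# Two rows from the question, plus two hypothetical rows:
# one with an inner bracket and one with no bracket at all
df = pd.DataFrame({'LGA': ['Alpine (S)', 'Bass Coast (S)',
                           'Foo (Bar) Town (C)', 'Melbourne']})

# Split once, starting from the right, on the last '(' and keep the left part.
# Rows without '(' come back unchanged because the split yields one piece.
df['LGA_clean'] = df['LGA'].str.rsplit('(', n=1).str[0].str.strip()

print(df['LGA_clean'].tolist())
# ['Alpine', 'Bass Coast', 'Foo (Bar) Town', 'Melbourne']
```

Note that this still cuts at the last '(' even when that bracket is not at the end of the string, so it suits data shaped like the sample in the question.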

Gerrit answered Sep 27 '22 23:09