I want to remove a few words from a column, and I have written the code below, which works fine:
finaldata['keyword'] = finaldata['keyword'].str.replace("Washington Times", "")
finaldata['keyword'] = finaldata['keyword'].str.replace("Washington Post", "")
finaldata['keyword'] = finaldata['keyword'].str.replace("Mail The Globe", "")
Now I have around 30 words to remove, but I can't repeat this line of code 30 times. Is there a way to solve this? If so, please guide me.
You can use regex here and reduce this to a single replace call.
words = ["Washington Times", "Washington Post", "Mail The Globe"]
p = '|'.join(words)
finaldata['keyword'] = finaldata['keyword'].str.replace(p, '', regex=True)  # regex=True is required in pandas >= 2.0
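One caveat with joining the words directly: if any phrase contains regex metacharacters (a dot, parentheses, and so on), or one phrase is a prefix of another, the pattern may not match what you expect. Here is a minimal sketch of a safer way to build it, using only the standard library. The helper name build_pattern and the phrase "U.S. News" are made up for illustration and are not part of the question.

```python
import re

def build_pattern(words):
    # Hypothetical helper: put longer phrases first so they win over shorter prefixes.
    ordered = sorted(words, key=len, reverse=True)
    # re.escape makes characters such as "." match literally.
    return "|".join(re.escape(w) for w in ordered)

# "U.S. News" is an invented example phrase containing regex metacharacters.
words = ["Washington Times", "Washington Post", "U.S. News"]
pattern = build_pattern(words)

cleaned = re.sub(pattern, "", "U.S. News and Washington Post daily")
untouched = re.sub(pattern, "", "UxSx News")  # the escaped dots do not match "x"
```

The resulting string can be passed to str.replace in place of p. Note that removing a phrase leaves the surrounding whitespace behind, so you may want to normalise spaces afterwards.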
For performance, if the data has no NaNs, you should consider using a list comprehension.
import re
p2 = re.compile(p)
finaldata['keyword'] = [p2.sub('', text) for text in finaldata['keyword']]
If there are NaNs, you can select the non-null rows with notna and use loc to reassign:
m = finaldata['keyword'].notna()
finaldata.loc[m, 'keyword'] = [
    p2.sub('', text) for text in finaldata.loc[m, 'keyword'].tolist()]
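To see the NaN-safe version end to end, here is a small self-contained sketch. The sample rows are invented for illustration, and the extra .strip() call is my own addition to trim the whitespace left behind after a phrase is removed; drop it if you want to keep the original spacing.

```python
import re
import pandas as pd

words = ["Washington Times", "Washington Post", "Mail The Globe"]
# Compile once; re.escape keeps any special characters literal.
p2 = re.compile("|".join(re.escape(w) for w in words))

# Invented sample data with one missing value.
finaldata = pd.DataFrame(
    {"keyword": ["Washington Post politics", None, "sports Mail The Globe"]}
)

# Only touch the rows that actually hold strings.
m = finaldata["keyword"].notna()
finaldata.loc[m, "keyword"] = [
    p2.sub("", text).strip() for text in finaldata.loc[m, "keyword"]
]
```

The missing row is skipped by the mask, so it stays missing instead of raising a TypeError inside the comprehension.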