Suppose that I have the following sentence:
bean likes to sell his beans
and I want to replace all occurrences of specific words with other words. For example, bean to robert and beans to cars.
I can't just use str.replace because in this case it'll change the beans to roberts:
>>> "bean likes to sell his beans".replace("bean","robert")
'robert likes to sell his roberts'
I need to replace whole words only, not occurrences of a word inside another word. I think I can achieve this with regular expressions, but I don't know how to do it right.
If you use regex, you can specify word boundaries with \b:
import re
sentence = 'bean likes to sell his beans'
sentence = re.sub(r'\bbean\b', 'robert', sentence)
# 'robert likes to sell his beans'
Here 'beans' is not changed (to 'roberts') because the position between 'bean' and the trailing 's' is not a word boundary: \b matches the empty string, but only at the beginning or end of a word.
The second replacement for completeness:
sentence = re.sub(r'\bbeans\b', 'cars', sentence)
# 'robert likes to sell his cars'
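To see where \b does and does not match, here is a minimal sketch. The sample text and the variable names are made up for illustration; only re.findall from the standard library is used.

```python
import re

# Hypothetical sample text, chosen to contain 'bean' in three different contexts
text = 'bean, beans and soybean'

# \b on both sides: only the standalone word matches
whole = re.findall(r'\bbean\b', text)

# No boundaries: every occurrence of the substring matches
anywhere = re.findall(r'bean', text)

# Leading boundary only: any word that starts with 'bean'
prefix = re.findall(r'\bbean\w*', text)

print(whole)     # ['bean']
print(anywhere)  # ['bean', 'bean', 'bean']
print(prefix)    # ['bean', 'beans']
```

A comma, a space, or the start or end of the string all count as boundaries, because \b only cares whether a word character sits next to a non-word character (or nothing).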
If you replace each word one at a time, you might replace words several times (and not get what you want). To avoid this, you can use a function or lambda:
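import re
d = {'bean': 'robert', 'beans': 'cars'}
str_in = 'bean likes to sell his beans'
# Look up each whole word in d; words without an entry are left unchanged
str_out = re.sub(r'\b(\w+)\b', lambda m: d.get(m.group(1), m.group(1)), str_in)
# 'robert likes to sell his cars'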
That way, once bean is replaced by robert, it won't be modified again (even if robert is also in your input list of words).
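A small sketch of the difference, assuming a mapping in which one replacement is itself a key (the swap dictionary and the sample text are invented for illustration):

```python
import re

# Hypothetical mapping that swaps two words
swap = {'bean': 'robert', 'robert': 'bean'}
text = 'bean met robert'

# One re.sub call per word: the second call rewrites the output of the first
chained = text
for old, new in swap.items():
    chained = re.sub(r'\b%s\b' % re.escape(old), new, chained)

# Single pass: each word of the original text is looked up exactly once
single = re.sub(r'\b(\w+)\b', lambda m: swap.get(m.group(1), m.group(1)), text)

print(chained)  # bean met bean
print(single)   # robert met bean
```

The chained version first turns the text into 'robert met robert' and then turns both words back into 'bean', which is the repeated replacement described above.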
As suggested by georg, I edited this answer with dict.get(key, default_value).
Alternative solution (also suggested by georg):
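# re.escape keeps keys that contain regex metacharacters from being read as pattern syntax
str_out = re.sub(r'\b(%s)\b' % '|'.join(map(re.escape, d)), lambda m: d.get(m.group(1), m.group(1)), str_in)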
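If you need this in more than one place, the alternation approach could be wrapped in a small helper. This is only a sketch: the name replace_words is made up, and the re.escape call and the longest-first ordering of the keys are my own additions, not part of the suggestion above.

```python
import re

def replace_words(text, mapping):
    """Replace whole words in text according to mapping, in a single pass."""
    if not mapping:
        # An empty alternation would match the empty string, so return early
        return text
    # Longest keys first, so a longer key is tried before any key it starts with
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r'\b(?:%s)\b' % '|'.join(map(re.escape, keys)))
    # Every match is one of the keys, so a plain lookup is enough
    return pattern.sub(lambda m: mapping[m.group(0)], text)

result = replace_words('bean likes to sell his beans',
                       {'bean': 'robert', 'beans': 'cars'})
print(result)  # robert likes to sell his cars
```

Compared with the \w+ version, the pattern here only stops at words that are actually in the dictionary, which may matter on long inputs with a small mapping.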