Optional dot in regex

Say I want to replace all the matches of Mr. and Mr with Mister.

I am using the following regex: \bMr(\.)?\b to match either Mr. or just Mr. Then, I use the re.sub() method to do the replacement.

What puzzles me is that Mr. gets replaced with "Mister." (dot included). Why is the dot kept at the end? It looks like the pattern is not matching the Mr\. case but just Mr.

import re
s="a rMr. Nobody Mr. Nobody is Mr Nobody and Mra Nobody."
re.sub(r"\bMr(\.)?\b","Mister", s)


'a rMr. Nobody Mister. Nobody is Mister Nobody and Mra Nobody.'

I also tried the following, but without luck either:

re.sub(r"\b(Mr\.|Mr)\b","Mister", s)

My desired output is:

'a rMr. Nobody Mister Nobody is Mister Nobody and Mra Nobody.'
                     ^                              ^
                     no dot            this should be kept as it is
fedorqui 'SO stop harming' asked Nov 13 '14 11:11


2 Answers

I think you want to match 'Mr' followed by either a '.' or a word boundary:

\bMr(?:\.|\b)

In use:

>>> import re
>>> re.sub(r"\bMr(?:\.|\b)", "Mister", "a rMr. Nobody Mr. Nobody is Mr Nobody and Mra Nobody.")
'a rMr. Nobody Mister Nobody is Mister Nobody and Mra Nobody.'
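
To see which text each branch of the alternation actually consumes, here is a small sketch that lists the matches on the same sample string (the names `pattern` and `matched` are only for illustration):

```python
import re

s = "a rMr. Nobody Mr. Nobody is Mr Nobody and Mra Nobody."
pattern = re.compile(r"\bMr(?:\.|\b)")

# Collect the exact text of every match: the dot is consumed when present,
# otherwise the trailing \b has to succeed.
matched = [m.group(0) for m in pattern.finditer(s)]
print(matched)  # ['Mr.', 'Mr']
```

"rMr." is skipped because there is no word boundary between r and M, and "Mra" is skipped because Mr is followed by neither a dot nor a boundary.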
jonrsharpe answered Sep 30 '22 07:09


re.sub(r"\bMr\.|\bMr\b","Mister", s)

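Try this. You need to remove the \b after the \.

Output: 'a rMr. Nobody Mister Nobody is Mister Nobody and Mra Nobody.'

\bMr(\.)?\b does not work because there is no word boundary between the . and the space that follows it: both are non-word characters. The engine therefore backtracks, lets (\.)? match nothing, and finds the boundary between r and . instead, so only Mr is replaced and the dot is left behind.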

There are three different positions that qualify as word boundaries:

  • Before the first character in the string, if the first character is a word character.
  • After the last character in the string, if the last character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.
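
These rules can be checked directly. The sketch below (the short string "Mr. N" and the variable names are made up for illustration) lists every position where \b matches, then reruns the pattern from the question:

```python
import re

# Indices in "Mr. N":  M=0  r=1  .=2  space=3  N=4
text = "Mr. N"
boundaries = [m.start() for m in re.finditer(r"\b", text)]
print(boundaries)  # [0, 2, 4, 5]

# Position 3 (between "." and " ") is missing: both are non-word characters,
# so \b cannot match right after the dot.
s = "a rMr. Nobody Mr. Nobody is Mr Nobody and Mra Nobody."
original = re.sub(r"\bMr(\.)?\b", "Mister", s)
print(original)
```

The second print shows the dot surviving after "Mister", which is the behaviour described in the question.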
vks answered Sep 30 '22 05:09

