
re.sub(...) replacing leftmost occurrences?

Tags:

python

regex

$ pydoc re.sub :

sub(pattern, repl, string, count=0, flags=0)
    Return the string obtained by replacing the leftmost
    non-overlapping occurrences of the pattern in string by the
    replacement repl.

>>> re.sub('ROAD', 'RD.', 'BRRROADBBRROAD ROAD ROAD MY ROAD')
'BRRRD.BBRRD. RD. RD. MY RD.'

I don't quite understand what "leftmost" means in the Python documentation. As far as I can see, re.sub(...) replaces all occurrences of pattern with repl.

Vaibhav Bajpai asked Jul 10 '11 06:07


People also ask

Does re.sub replace all occurrences?

By default, count is set to zero, which means re.sub() will replace all pattern occurrences in the target string.

What is r in re.sub?

The r prefix is part of the string syntax. With r, Python doesn't interpret backslash sequences such as \n, \t, etc. inside the quotes. Without r, you'd have to type each backslash twice in order to pass it to re.sub.


2 Answers

Note the plural 's' at the end of "leftmost non-overlapping occurrences".

re.sub replaces all occurrences. You can use the optional count argument to limit the number of replacements it makes.
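As a minimal sketch of the count argument, the snippet below reuses the string from the question. The variable names replace_all and replace_two are only illustrative.

```python
import re

text = 'BRRROADBBRROAD ROAD ROAD MY ROAD'

# count=0 (the default) replaces every non-overlapping match.
replace_all = re.sub('ROAD', 'RD.', text)

# count=2 stops after the two leftmost matches; the rest are left alone.
replace_two = re.sub('ROAD', 'RD.', text, count=2)

print(replace_all)  # BRRRD.BBRRD. RD. RD. MY RD.
print(replace_two)  # BRRRD.BBRRD. ROAD ROAD MY ROAD
```

With a nonzero count, the matches that get replaced are always the ones found first when scanning from the left.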

"Leftmost non-overlapping" means that if several occurrences are overlapping and can be potentially replaced, only the leftmost will:

>>> import re
>>> s = 'AABBBBAA'  # avoid shadowing the built-in str
>>> re.sub('BBB', 'CCC', s)
'AACCCBAA'

As you can see, there are two (overlapping) occurrences of BBB here. Only the leftmost is replaced.
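One way to see the two overlapping candidates is to compare a normal scan with a zero-width lookahead scan. This is only a sketch for illustration; the lookahead pattern is not something re.sub uses internally, and the variable names are made up for the example.

```python
import re

s = 'AABBBBAA'

# Non-overlapping scan, the same way re.sub walks the string: after
# matching at index 2, the search resumes at index 5, where only one
# 'B' is left, so no second match is found.
non_overlapping = [m.start() for m in re.finditer('BBB', s)]

# A zero-width lookahead consumes no characters, so it reports every
# index where 'BBB' could start, including overlapping positions.
all_candidates = [m.start() for m in re.finditer('(?=BBB)', s)]

print(non_overlapping)  # [2]
print(all_candidates)   # [2, 3]
```

The candidate at index 3 overlaps the match at index 2, so it is never replaced.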

Eli Bendersky answered Sep 19 '22 19:09


You can see what "leftmost" means in this example:

>>> import re
>>> re.sub('haha', 'Z', 'hahaha')
'Zha'

Note that we did not get 'haZ', which would have been the rightmost substitution.
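For contrast, here is a hedged sketch of how a rightmost substitution could be forced. It relies on a greedy prefix group, which is a workaround chosen for this example and not an option built into re.sub.

```python
import re

text = 'hahaha'

# Default behaviour: the leftmost 'haha' (indices 0-3) is replaced.
leftmost = re.sub('haha', 'Z', text)

# Workaround: a greedy (.*) swallows as much as possible first, so the
# 'haha' that follows it is the last one in the string. The prefix is
# put back with the \1 backreference.
rightmost = re.sub(r'(.*)haha', r'\1Z', text, count=1)

print(leftmost)   # Zha
print(rightmost)  # haZ
```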

Ray Toal answered Sep 19 '22 19:09