Consider the following multiline string:
>>> print s
shall i compare thee to a summer's day?
thou art more lovely and more temperate
rough winds do shake the darling buds of may,
and summer's lease hath all too short a date.
re.sub() replaces all occurrences of "and" with "AND":
>>> print re.sub("and", "AND", s)
shall i compare thee to a summer's day?
thou art more lovely AND more temperate
rough winds do shake the darling buds of may,
AND summer's lease hath all too short a date.
But re.sub() doesn't allow ^ anchoring to the beginning of the line, so adding it causes no occurrence of "and" to be replaced:
>>> print re.sub("^and", "AND", s)
shall i compare thee to a summer's day?
thou art more lovely and more temperate
rough winds do shake the darling buds of may,
and summer's lease hath all too short a date.
How can I use re.sub() with start-of-line (^) or end-of-line ($) anchors?
The re.match() function checks for a match only at the beginning of the string. If the pattern matches there, it returns a match object; otherwise it returns None. It does not scan forward for a later occurrence; that is what re.search() does.
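The difference between matching at the start of the string and matching at the start of a line can be checked directly. The following is a minimal sketch, written for Python 3 (so print is a function) and assuming s holds the sonnet lines from the question:

```python
import re

# Assumed reconstruction of the string s from the question.
s = ("shall i compare thee to a summer's day?\n"
     "thou art more lovely and more temperate\n"
     "rough winds do shake the darling buds of may,\n"
     "and summer's lease hath all too short a date.")

# re.match() only tries the very start of the whole string.
at_start = re.match("shall", s)           # a match object
not_at_start = re.match("and", s)         # None: s does not begin with "and"
still_none = re.match("and", s, re.M)     # None as well, even in multiline mode

# re.search() scans forward; with re.M, ^ may match right after a newline.
m = re.search("^and", s, re.M)

print(at_start is not None, not_at_start, still_none)
print(s[m.start():m.start() + 3])
```

So re.match() is not a substitute for a ^ anchor in multiline mode; re.search() or re.sub() with the multiline flag is the closer fit for line-by-line anchoring.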
The sub() function belongs to the regular expression (re) module in Python. It returns a string in which all matching occurrences of the specified pattern are replaced by the replacement string. To use this function, import the re module first.
Summary: The caret operator ^ matches at the beginning of a string. The dollar-sign operator $ matches at the end of a string. If you want to match at the beginning or end of each line in a multi-line string, you can set the re.MULTILINE flag.
You forgot to enable multiline mode.
re.sub("^and", "AND", s, flags=re.M)
re.M
re.MULTILINE
When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
source
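To see the flag's effect on both anchors, here is a short sketch in Python 3 syntax, again assuming s is the string from the question. Note that flags is passed by keyword: the fourth positional parameter of re.sub() is count, so a flag passed positionally there would be read as a replacement limit instead.

```python
import re

# Assumed reconstruction of the string s from the question.
s = ("shall i compare thee to a summer's day?\n"
     "thou art more lovely and more temperate\n"
     "rough winds do shake the darling buds of may,\n"
     "and summer's lease hath all too short a date.")

# Without the flag, ^ matches only at the start of the whole string.
unchanged = re.sub("^and", "AND", s)

# With re.M, ^ also matches right after each newline.
fixed = re.sub("^and", "AND", s, flags=re.M)

# $ behaves the same way: with re.M it matches before each newline,
# so the comma ending the third line can be targeted.
line_ends = re.sub(",$", ";", s, flags=re.M)

print(fixed)
print(line_ends.splitlines()[2])
```

Only the "and" that starts the last line is replaced; the "and" in the middle of the second line is left alone because it does not follow a newline.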
The flags argument of re.sub() isn't available in Python versions older than 2.7; in those cases you can set the flag directly in the regular expression like so:
re.sub("(?m)^and", "AND", s)
Add (?m) for multiline:
print re.sub(r'(?m)^and', 'AND', s)
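As a quick check that the inline (?m) flag and the flags argument are interchangeable, the sketch below compares them, along with a precompiled pattern, in Python 3 syntax. The string s is assumed to be the one from the question. Keep (?m) at the very start of the pattern; newer Python versions reject global inline flags placed elsewhere.

```python
import re

# Assumed reconstruction of the string s from the question.
s = ("shall i compare thee to a summer's day?\n"
     "thou art more lovely and more temperate\n"
     "rough winds do shake the darling buds of may,\n"
     "and summer's lease hath all too short a date.")

# Three ways to turn on multiline mode for the same substitution.
inline = re.sub(r"(?m)^and", "AND", s)                       # flag inside the pattern
flagged = re.sub(r"^and", "AND", s, flags=re.MULTILINE)      # flag as keyword argument
compiled = re.compile(r"^and", re.MULTILINE).sub("AND", s)   # flag at compile time

print(inline == flagged == compiled)
print(inline.splitlines()[-1])
```

The compiled form is convenient when the same anchored pattern is applied to many strings, since the flag is attached once to the pattern object.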
See the re documentation here.