Python regular expression matching using $

Question

I'm using Python 2.7.0 and doing the following in the interpreter:

>>> import re
>>> re.search(r"//\s*.*?$", "//\n\na12345678", flags=re.MULTILINE|re.DOTALL).group()
'//\n\na12345678'

This is not what I expected. I thought $ would match before the end of the line, but the match included the two newline characters and the text after them.

Surprisingly, this works:

>>> re.search(r"//\s*.*?$", "//1\n\na12345678", flags=re.MULTILINE|re.DOTALL).group()
'//1'

What am I misunderstanding here about python regular expressions?

Some more info:

>>> re.search(r"//\s*.*", "//\n  test").group()
'//\n  test'
>>> re.search(r"//\s*.*", "//1\n  test").group()
'//1'

This last block of code is without MULTILINE and DOTALL. What am I misunderstanding here? .* shouldn't match the newline, and definitely shouldn't go past it, right?

Andrew Clark · Accepted Answer

\s can match newlines, and when you use the re.DOTALL flag . can also match newlines.
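Both statements are easy to check in isolation. A minimal sketch, with sample strings made up for illustration:

```python
import re

# \s is the whitespace class, and a newline counts as whitespace.
newline_is_space = re.match(r"\s", "\n") is not None

# Without re.DOTALL, "." stops at the first newline.
dot_default = re.match(r".*", "ab\ncd").group()

# With re.DOTALL, "." matches the newline as well.
dot_dotall = re.match(r".*", "ab\ncd", flags=re.DOTALL).group()

print(newline_is_space)   # True
print(repr(dot_default))  # 'ab'
print(repr(dot_dotall))   # 'ab\ncd'
```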

In the first case your \s* is greedy. Since the first characters after the // in your string are newlines, \s* consumes them. The lazy .*? then has to match the entire final line, because $ cannot match until the very end of the string.

In the second case \s* matches nothing because of the 1 after the //, and the lazy .*? matches only up to just before the first newline, where $ can already match.
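One way to see which part of the pattern consumes what is to wrap \s* and .*? in capture groups. The groups are added here purely for illustration; the patterns and input strings are otherwise the ones from the question:

```python
import re

# The question's pattern, with capture groups added around \s* and .*?
pattern = re.compile(r"//(\s*)(.*?)$", flags=re.MULTILINE | re.DOTALL)

# First case: \s* swallows both newlines, so .*? must run to the end.
first = pattern.search("//\n\na12345678").groups()

# Second case: \s* matches nothing, and .*? stops before the first newline.
second = pattern.search("//1\n\na12345678").groups()

# The flag-free example from the question behaves the same way:
# \s* consumes the newline and the indentation, then .* takes the rest of the line.
no_flags = re.search(r"//(\s*)(.*)", "//\n  test").groups()

print(first)     # ('\n\n', 'a12345678')
print(second)    # ('', '1')
print(no_flags)  # ('\n  ', 'test')
```

The last result shows that the newline is crossed by \s*, not by the dot, which is why the match spans lines even without re.DOTALL.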

If you want to match whitespace but not newlines, you can use a character class such as [ ] (or [ \t] to also allow tabs) in place of \s. For your examples, though, it looks like you will get the expected behavior if you just use the regex //.*?$ with the re.MULTILINE flag enabled. re.DOTALL can be included as well; it will not make a difference in this case.
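A short sketch of these suggestions follows. The class [ \t] (space and tab) is used as an assumed stand-in for "whitespace except newlines", and the third input string is invented for the example:

```python
import re

text = "//\n\na12345678"

# Simplified pattern: the lazy .*? stops at the first end of line.
simple = re.search(r"//.*?$", text, flags=re.MULTILINE).group()

# Adding re.DOTALL gives the same result here, because the lazy .*?
# stops as soon as $ can match.
with_dotall = re.search(r"//.*?$", text, flags=re.MULTILINE | re.DOTALL).group()

# Allowing leading spaces and tabs, but not newlines, after the //.
spaces_only = re.search(
    r"//[ \t]*.*?$", "//  \tnote\nnext", flags=re.MULTILINE
).group()

print(repr(simple))       # '//'
print(repr(with_dotall))  # '//'
print(repr(spaces_only))  # '//  \tnote'
```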

Tags:

python

user2533302
