
Why does \w+ match a trailing newline?

Tags:

python

regex

I am curious why the following would output that there was a match:

import re

foo = 'test\n'
# Raw string so that \w reaches the regex engine unchanged
match = re.search(r'^\w+$', foo)

if match is None:
    print("It did not match")
else:
    print("Match!")

The newline is before the end of the string, yes? Why is this matching?

Jessie A. Morris asked Jul 08 '11 23:07


2 Answers

From Python's re documentation.

'$'
Matches the end of the string or just before the newline at the end of the string, and in MULTILINE mode also matches before a newline. foo matches both ‘foo’ and ‘foobar’, while the regular expression foo$ matches only ‘foo’. More interestingly, searching for foo.$ in 'foo1\nfoo2\n' matches ‘foo2’ normally, but ‘foo1’ in MULTILINE mode; searching for a single $ in 'foo\n' will find two (empty) matches: one just before the newline, and one at the end of the string.
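
As a quick check of the behaviour quoted above, here is a minimal sketch that reuses the strings from the question and from the documentation excerpt. The variable names are only illustrative.

```python
import re

foo = 'test\n'

# `$` may match just before a single trailing newline, so this search succeeds.
m = re.search(r'^\w+$', foo)
matched_text = m.group(0)  # the newline itself is not part of the match

# A lone `$` finds two empty matches in 'foo\n':
# one just before the newline and one at the very end of the string.
dollar_positions = [hit.start() for hit in re.finditer(r'$', 'foo\n')]

# Without MULTILINE, `$` only matches near the end of the whole string;
# with MULTILINE it also matches before the inner newline.
normal = re.search(r'foo.$', 'foo1\nfoo2\n').group(0)
multiline = re.search(r'foo.$', 'foo1\nfoo2\n', re.MULTILINE).group(0)

print(repr(matched_text), dollar_positions, normal, multiline)
```

The matched text is 'test' without the newline, which is why the original pattern reports a match even though the string ends in '\n'.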

Andrew Clark answered Nov 17 '22 01:11


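In Python, ^ and $ are not strict "start of string" and "end of string" anchors: $ also matches just before a newline at the end of the string, and with re.MULTILINE both match at every line boundary ("start of line" and "end of line"). Use \A for "start of string" and \Z for "end of string".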
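
A short sketch of how the stricter anchors behave on the string from the question. The re.fullmatch call is not mentioned in the answers; it is added here only as an alternative that should give the same result on Python 3.4 and later.

```python
import re

foo = 'test\n'

# `$` tolerates the trailing newline, so this still matches.
loose = re.search(r'^\w+$', foo)

# `\A` and `\Z` anchor to the real start and end of the string,
# so the trailing newline makes the search fail.
strict = re.search(r'\A\w+\Z', foo)

# Alternative (an addition, not from the answers): fullmatch requires
# the whole string to match the pattern.
full = re.fullmatch(r'\w+', foo)

# Without the newline, the strict pattern matches as expected.
clean = re.search(r'\A\w+\Z', 'test')

print(loose is not None, strict is None, full is None, clean is not None)
```

If the input may legitimately end in a newline, stripping it first (for example with foo.rstrip('\n')) and then using the strict pattern is another option.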

Paige Ruten answered Nov 17 '22 01:11