Dollar sign in regular expression and new line character

Tags: regex

I know that the dollar sign anchors a match to the end of the string, so that the search does not stop in the middle of the string but has to continue to its end.

But how does it deal with a newline character? Does it match just before the newline, or does it take the newline into account?

I checked this in Eclipse's regex search. For a regex matching an array of strings, ([A-Za-z ]+)$\n worked, but the other way around, ([A-Za-z ]+\n)$, did not.

Dude asked Dec 17 '12 10:12


2 Answers

Note that ^ and $ are zero-width tokens: they don't match any character, but rather a position.

  • ^ matches the position before the first character in a string.
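  • $ matches the position at the end of the string, or just before a newline that ends the string. In multiline mode it also matches just before every other newline.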
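
To make the "position, not character" point concrete, here is a minimal sketch in Python (chosen only for illustration; the sample text and the helper name `anchor_positions` are made up for this example). It lists where ^ and $ match, and every span has zero length. Other engines may differ in detail.

```python
import re

# Hypothetical sample input: two lines, each terminated by a newline.
text = "first line\nsecond line\n"

def anchor_positions(pattern, flags=0):
    """Return the (start, end) span of every match of a zero-width pattern."""
    return [m.span() for m in re.finditer(pattern, text, flags)]

# Default mode: ^ only matches at the very start of the string.
default_caret = anchor_positions(r"^")

# Default mode: $ matches just before the final newline and at the very end.
default_dollar = anchor_positions(r"$")

# Multiline mode: $ additionally matches before every other newline.
multiline_dollar = anchor_positions(r"$", re.MULTILINE)

print(default_caret)     # [(0, 0)]
print(default_dollar)    # [(22, 22), (23, 23)]
print(multiline_dollar)  # [(10, 10), (22, 22), (23, 23)]
```

Each span starts and ends at the same index, which is what "zero-width" means: the anchors assert something about a position and consume nothing.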

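So the text matched before the $ never includes that newline. When the search runs over several lines, ([A-Za-z ]+\n)$ consumes the \n first, so $ is then tested at the start of the next line, where it fails unless that line is empty or the input ends there. ([A-Za-z ]+)$\n succeeds because $ is tested just before the newline.

In short, the only thing that can follow your $ is a newline, never any other character.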
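
The two patterns from the question can be compared directly. The sketch below uses Python with `re.MULTILINE`, on the assumption that this mode approximates how an editor search such as Eclipse's treats $ across lines; the sample text is invented for the example.

```python
import re

# Hypothetical multi-line input, similar to text in an editor buffer.
text = "alpha beta\ngamma delta\n"

# $ is tested just before each newline, then \n consumes it.
newline_after_dollar = re.findall(r"([A-Za-z ]+)$\n", text, re.MULTILINE)

# \n is consumed first, so $ is tested at the start of the following line.
# That position only qualifies when the input ends there (or the line is empty).
newline_before_dollar = re.findall(r"([A-Za-z ]+\n)$", text, re.MULTILINE)

print(newline_after_dollar)   # ['alpha beta', 'gamma delta']
print(newline_before_dollar)  # ['gamma delta\n']
```

Under these assumptions the first pattern finds every line, while the second only finds the last one, where the position after the newline happens to be the end of the input.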

Rohit Jain answered Nov 07 '22 11:11


If the string ends with a newline, then $ usually matches before that character. That goes at least for Perl, PCRE, Java and .NET. (Edit: as Tim Pietzcker points out in a comment, \r is not considered a line break by .NET.)

This was introduced because input that is read line by line is terminated with a newline (at least in Perl), which can conveniently be ignored this way.

Use \z to signify the very end of the string (if it's supported by your regex engine).
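
As a rough illustration of the difference, here is a sketch in Python, whose `re` module treats $ the same way as the engines listed above. Note the assumption: Python spells the strict end-of-string anchor `\Z`, where Perl, Java and .NET use `\z`. The sample line is made up.

```python
import re

# Hypothetical line as it might come from reading a file, newline included.
line = "hello world\n"

# $ tolerates the single trailing newline and matches just before it.
with_dollar = re.search(r"[a-z ]+$", line)

# \Z in Python only matches at the very end of the string, so the
# trailing newline (which the character class excludes) blocks the match.
with_strict_end = re.search(r"[a-z ]+\Z", line)

# Once the newline is stripped, the strict anchor matches as well.
stripped = re.search(r"[a-z ]+\Z", line.rstrip("\n"))

print(with_dollar.group())   # hello world
print(with_strict_end)       # None
print(stripped.group())      # hello world
```

The strict anchor is the safer choice for validation, where a stray trailing newline should cause a rejection rather than be silently accepted.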


Martin Ender answered Nov 07 '22 10:11