Until a few minutes ago, I believed that Perl's $ matches any kind of end of line. Unfortunately, my assumption turns out to be wrong.
The following script removes the word end only from $string3.
use warnings;
use strict;
my $string1 = " match to the end" . chr(13);
my $string2 = " match to the end" . chr(13) . chr(10);
my $string3 = " match to the end" . chr(10);
$string1 =~ s/ end$//;
$string2 =~ s/ end$//;
$string3 =~ s/ end$//;
print "$string1\n";
print "$string2\n";
print "$string3\n";
But I am almost 75% sure that I have seen cases where $ matched at least chr(13).chr(10).
So, what exactly (and under what circumstances) does the $ atom match?
First of all, it depends on whether the /m modifier is in effect or not.
With /m active, it matches before a \n character or at the end of the string. It's equivalent to (?=\n|\z).
Without /m, it matches before a \n character if that is the last character of the string, or at the end of the string. It's equivalent to (?=\n?\z).
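A minimal sketch of the two cases above. The helper name end_matches and the sample strings are made up for illustration; they are not part of the question's script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if " end" is followed by a position
# where $ matches, with or without the /m modifier.
sub end_matches {
    my ($string, $use_m) = @_;
    return $use_m ? ($string =~ / end$/m ? 1 : 0)
                  : ($string =~ / end$/  ? 1 : 0);
}

my @samples = (
    [ 'LF last',      "to the end\n"       ],   # \n is the last character
    [ 'LF then text', "to the end\nmore"   ],   # \n is followed by more text
    [ 'CR last',      "to the end\r"       ],   # ends in a carriage return
    [ 'CRLF last',    "to the end\r\n"     ],   # \r sits between "end" and \n
);

for my $sample (@samples) {
    my ($label, $string) = @$sample;
    printf "%-13s without /m: %d  with /m: %d\n",
        $label, end_matches($string, 0), end_matches($string, 1);
}
```

Only the first sample matches without /m. The second matches only with /m, and the two samples containing \r match in neither mode, because the character right after "end" is \r rather than \n.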
It does not match a generic newline. The \R metacharacter (introduced in 5.10.0) does that (but without the end-of-string property of $). You can substitute \R for \n in one of the previous equivalences to get a $ work-alike that does match a generic newline.
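As a rough sketch of that substitution, applied to the strings from the question: the non-/m equivalent (?=\n?\z) becomes (?=\R?\z). The helper name strip_end is hypothetical, and the line endings are written as \x0D and \x0A so the example does not depend on what \n means on the platform.

```perl
use strict;
use warnings;
use 5.010;    # \R requires Perl 5.10.0 or later

# Hypothetical helper: remove a trailing " end" that sits before an
# optional generic newline at the very end of the string.
sub strip_end {
    my ($string) = @_;
    (my $copy = $string) =~ s/ end(?=\R?\z)//;
    return $copy;
}

for my $eol ("\x0D", "\x0D\x0A", "\x0A", "") {
    my $result = strip_end(" match to the end" . $eol);

    # Describe the line ending by its code points, e.g. "13+10".
    my $label = join("+", map { ord($_) } split //, $eol) || "(none)";

    printf "%-8s %s\n", $label,
        $result eq " match to the" . $eol ? "removed" : "kept";
}
```

With this pattern the word is removed for all four endings, and the line ending itself is left in place because the lookahead does not consume it.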
Note that \n is not always chr(10). It depends on the platform. Most platforms currently in use have \n meaning chr(10), but that wasn't always the case. For example, on older Macs, \n was chr(13) and \r was chr(10).
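If in doubt, a quick check along these lines shows what \n and \r stand for on the platform at hand. The output depends on the platform, so none is shown here.

```perl
use strict;
use warnings;

# Print the code points that "\n" and "\r" map to on this platform.
printf "\\n is chr(%d)\n", ord("\n");
printf "\\r is chr(%d)\n", ord("\r");
```

When a pattern has to match one specific character regardless of platform, writing \x0A or \x0D instead of \n or \r avoids the ambiguity.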
$ matches only the position before \n/chr(10) and not before \r/chr(13).
It's very often described as matching before a "newline" character (which in a lot of cases causes no problems), but strictly speaking it matches before a linefeed character and not before a carriage return character!
See Regex Tutorial - Start and End of String or Line Anchors.