Does anyone know for sure whether setting $/ = "\R" will reliably let chomp() do the correct thing, that is, remove whatever end-of-line convention a line uses?
Specifically, I run scripts on Windows and UNIX and have to process files that come off the net and have unknown end-of-line conventions: MS-DOS (CRLF), UNIX (LF), classic Mac OS (CR), whatever.
I recently stumbled on "\R", but I hadn't seen it before. I think it's new. Well, newer than Perl 5.006. (It's been a while.)
The "\R" claims to do Unicode newlines, as well. I have no way to test this correctly.
Thanks.
-Erik
I was surprised to learn there's actually a "newline" tag in stackoverflow.
Will setting $/='\R' allow chomp() to work correctly with most files in perl?
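Setting $/ to '\R' (single quotes) makes Perl treat the two-character sequence "\\R", a literal backslash followed by R, as the record separator. Setting $/ to "\R" (double quotes) results, with warnings enabled, in a warning about an Unrecognized escape, and the separator ends up being the single character "R".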
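To make the difference between the two quoting styles concrete, here is a small sketch. The variable names and the sample data are made up for illustration; the only thing it relies on is how Perl handles the escape in each kind of string.

```perl
use strict;
use warnings;

# Single quotes: no escape processing, so this is backslash + "R".
my $single = '\R';

# Double quotes: \R is not a known string escape, so Perl passes the "R" through.
my $double = do {
    no warnings 'misc';    # silence "Unrecognized escape \R passed through"
    "\R";
};

printf "single-quoted: length %d\n", length $single;
printf "double-quoted: length %d, value '%s'\n", length $double, $double;

# Illustrative data: with $/ set to "R", records end at every capital R.
my $data = "fooRbar\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
local $/ = $double;
my @records = <$fh>;
close $fh;

print "record: [$_]\n" for @records;
```

Neither variant gives you "any newline" semantics: one looks for a backslash followed by R, the other splits the input at every capital R.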
\R is not a string but has a meaning only in the context of regular expressions. But the documentation for $/ clearly states:
Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)
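Since \R only works inside a regex, one way around the limitation is to leave $/ alone and let a regex do the work instead. The following is a minimal sketch, not the one true way: the helper names split_lines, read_lines and chomp_any are made up for this example, it needs Perl 5.10 or later for \R, and read_lines assumes the file is UTF-8 (decoding matters, because on undecoded bytes \R could match a byte in the middle of a multi-byte character).

```perl
use strict;
use warnings;
use 5.010;    # \R was introduced in Perl 5.10

# Hypothetical helper: split a string into lines on any line ending
# (\r\n, \n, \r, and the Unicode line separators matched by \R).
sub split_lines {
    my ($text) = @_;
    return split /\R/, $text;
}

# Hypothetical helper: read a whole file and split it into lines.
# Assumption: the file is UTF-8; change the layer for other encodings.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<:encoding(UTF-8)', $path
        or die "Cannot open $path: $!";
    local $/;    # undef means slurp mode: read the file in one go
    my $text = <$fh>;
    close $fh;
    return split_lines($text // '');
}

# Hypothetical helper: a chomp() stand-in that removes one trailing
# line ending of any kind and returns the trimmed string.
sub chomp_any {
    my ($line) = @_;
    $line =~ s/\R\z//;
    return $line;
}
```

Usage would look like `my @lines = read_lines('file-with-ambiguous-line-endings.txt');`. Note that split drops trailing empty fields, so blank lines at the very end of the file are lost; pass -1 as the third argument to split if you need them. Slurping is also a poor fit for very large files.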
I created Acme::InputRecordSeparatorIsRegexp a while ago as a joke, but it does provide a workaround for the restriction that $/ cannot be a regular expression. With version 0.04 (just uploaded), you can say:
use Acme::InputRecordSeparatorIsRegexp ':all';
open my $fh, '<:irs(\R)', 'file-with-ambiguous-line-endings.txt';
autochomp($fh,1); # or (tied *$fh)->autochomp(1)
@lines = <$fh>;
...