
Perl \R regex strip Windows newline character

I have a Perl script that uses the following code to strip possible Windows newline characters from an input file:

foreach my $line(split /\r|\R/)

Running the same script on two different Linux machines gives different results. On machine1 the script works as intended. On machine2 the line is split every time a capital "R" character is found, and the result is garbled.

I would like to know if the \R regex is correct and how to make machine2 behave as intended.

gmarco asked Jun 04 '15 14:06



1 Answer

Perl has several escape sequences for matching line-ending characters:

\n matches a line feed (newline) character (ASCII 10)
\r matches a carriage return (ASCII 13)
\R matches any Unicode line-break sequence, treating \r\n as a single unit

Windows ends lines with the two characters ASCII 13 + ASCII 10 (\r\n), while Unix uses only ASCII 10 (\n). \R matches any Unicode line-break sequence, including \r, \n, and \r\n.
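As a rough illustration (a minimal sketch, assuming Perl 5.10 or later; the sample string and variable names are made up for this example), plain \R may be preferable to the \r|\R pattern from the question. With the alternation, \r is tried first, so the \n of a Windows line ending is matched as a second, separate break:

```perl
use strict;
use warnings;

# Hypothetical sample input mixing Windows (\r\n) and Unix (\n) line endings.
my $text = "first\r\nsecond\nthird";

# \R consumes "\r\n" as a single line break.
my @clean = split /\R/, $text;

# With /\r|\R/ the \r alternative matches first, so the following "\n"
# is matched as a separate break and an empty field appears in between.
my @messy = split /\r|\R/, $text;

print scalar(@clean), " fields with /\\R/\n";
print scalar(@messy), " fields with /\\r|\\R/\n";
```

If the extra empty field matters for your input, split /\R/ on its own should be enough.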

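The most likely reason \R works on one machine and not the other is a difference in Perl versions. \R was introduced in Perl 5.10.0; an older Perl does not recognize the escape and treats it as a literal R, which would explain the splits on every capital "R". If machine2 is running an older version, updating Perl should solve your issue.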
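If updating is not an option, one possible workaround is to avoid \R altogether. The following is only a sketch under that assumption (the sample string is hypothetical): it prints the running Perl version so the two machines can be compared, then splits on an explicit alternation that older Perls also understand:

```perl
use strict;
use warnings;

# $] holds the running Perl version as a number, e.g. 5.010000 for 5.10.0.
print "Running Perl $]\n";

# Hypothetical sample containing capital R's and mixed line endings.
my $text = "Red\r\nRose\nRiver";

# Portable fallback that does not rely on \R:
# try "\r\n" first, then a lone "\n" or a lone "\r".
my @lines = split /\r\n|\n|\r/, $text;

print "$_\n" for @lines;
```

This only covers the \r\n, \n, and \r endings, not the other Unicode line breaks that \R matches, which is usually enough for Windows and Unix text files.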

More info:

  • Perl Regular Expression Backslash Sequences and Escapes
  • Syntax of Regular Expressions in Perl
l'L'l answered Sep 29 '22 05:09