Disclaimer: I've cross-posted this over at PerlMonks.
In Perl5, I can quickly and easily print out the hex representation of the \r\n
Windows-style line ending:
perl -nE '/([\r\n]{1,2})/; print(unpack("H*",$1))' in.txt
0d0a
If you want to test, you can create a Windows-ending file on Unix: create an in.txt file with a single line and line ending, then run perl -ni -e 's/\n/\r\n/g;print' in.txt (or, in vi/vim, create the file and just do :set ff=dos).
I have tried many things in Perl6 to do the same thing, but I can't get it to work no matter what I do. Here's my most recent test:
use v6;
use experimental :pack;
my $fn = 'in.txt';
my $fh = open $fn, chomp => False; # I've also tried :bin
for $fh.lines -> $line {
    if $line ~~ /(<[\r\n]>**1..2)/ {
        $0.Str.encode('UTF-8').unpack("H*").say;
    }
}
Outputs 0a, as do:
/(\n)/
/(\v)/
First, I don't even know if I'm using unpack() or the regex properly. Second, how do I capture both elements (\r\n) of the newline in P6?
Perl 6 automatically chomps the line separator off for you, which means it isn't there when you try to do a substitution.
Perl 6 also creates synthetic characters when there are combining characters, so if you want a base-16 representation of your input, use the encoding 'latin1' or use methods on $*IN that return a Buf.
This example just appends CRLF to the end of every line. (The last line will always end with 0D 0A, even if it didn't have a line terminator.)
perl6 -ne 'BEGIN $*IN.encoding("latin1"); #`( basically ASCII )
$_ ~= "\r\n"; #`( append CRLF )
put .ords>>.fmt("%02X");'
You could also turn off the autochomp behaviour.
perl6 -ne 'BEGIN {
$*IN.encoding("latin1");
$*IN.chomp = False;
};
s/\n/\r\n/;
put .ords>>.fmt("%02X");'
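The other route mentioned above, working with a Buf instead of decoded text, can be sketched roughly as follows. This is only an illustration: it reads from a file rather than $*IN, the file name eol-demo.txt is made up for the sketch, and it assumes the file is small enough to slurp in one go.

```raku
use v6;

# Hypothetical file name, used only for this sketch.
my $path = 'eol-demo.txt'.IO;

# Write known bytes: "ab" CR LF "def" CR LF
$path.spurt: Blob.new(0x61, 0x62, 0x0D, 0x0A, 0x64, 0x65, 0x66, 0x0D, 0x0A);

# Slurp in binary mode: the result is a Buf of raw bytes, so no decoding
# and no chomping is applied to the line endings.
my $bytes = $path.slurp(:bin);

# Print each byte as two uppercase hex digits, separated by spaces.
put $bytes.list.map(*.fmt('%02X')).join(' ');
```

Because nothing is decoded, the CR and LF bytes show up as separate values (0D and 0A) in the output.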
OK, I'm sorry I didn't make this clear when I posted the question: my goal was to read a file, capture its line endings, and write the file back out using the original line endings (and not the endings for the current platform).
I got a proof of concept working now. I'm very new to Perl 6, so the code probably isn't very p6-ish, but it does do what I needed it to.
Code tested on FreeBSD:
use v6;
use experimental :pack;
my $fn = 'in.txt';
my $outfile = 'out.txt';
# write something with a windows line ending to a new file
my $fh = open $fn, :w;
$fh.print("ab\r\ndef\r\n");
$fh.close;
# re-open the file
$fh = open $fn, :bin;
my $eol_found = False;
my Str $recsep = '';
# read one byte at a time, or else we'd have to slurp the whole
# file, as I can't find a way to differentiate EOL from EOF
while $fh.read(1) -> $buf {
    my $hex = $buf.unpack("H*");
    if $hex ~~ /(0d|0a)/ {
        $eol_found = True;
        $recsep = $recsep ~ $hex;
        next;
    }
    if $eol_found {
        if $hex !~~ /(0d|0a)/ {
            last;
        }
    }
}
$fh.close;
my %recseps = (
    '0d0a' => "\r\n",
    '0d'   => "\r",
    '0a'   => "\n",
);
my $nl = %recseps{$recsep};
# write a new file with the saved record separator
$fh = open $outfile, :w;
$fh.print('a' ~ $nl);
$fh.close;
# re-read file to see if our newline stuck
$fh = open $outfile, :bin;
my $buf = $fh.read(1000);
say $buf;
Output:
Buf[uint8]:0x<61 0d 0a>
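For comparison, here is a rough sketch of the same idea that stays at the byte level throughout, without unpack or hex strings. It is not a drop-in replacement for the code above: detect-eol and the two file names are hypothetical names invented for this sketch, it slurps the whole file (so it assumes the file fits in memory), and it falls back to LF when no terminator is found, which is an arbitrary choice.

```raku
use v6;

# Hypothetical helper (not from the original post): return the first line
# terminator found in $bytes as a Blob (CR LF, CR, or LF).
sub detect-eol(Blob $bytes) {
    for ^$bytes.elems -> $i {
        if $bytes[$i] == 0x0A {
            return Blob.new(0x0A);
        }
        if $bytes[$i] == 0x0D {
            # CR followed directly by LF counts as one CRLF terminator.
            my $crlf = $i + 1 < $bytes.elems && $bytes[$i + 1] == 0x0A;
            return $crlf ?? Blob.new(0x0D, 0x0A) !! Blob.new(0x0D);
        }
    }
    return Blob.new(0x0A);   # assumption: default to LF if nothing was found
}

# Hypothetical file names, used only for this sketch.
my $in  = 'eol-demo-in.txt'.IO;
my $out = 'eol-demo-out.txt'.IO;

# Write "ab" CR LF "def" CR LF as raw bytes, then detect the terminator.
$in.spurt: Blob.new(0x61, 0x62, 0x0D, 0x0A, 0x64, 0x65, 0x66, 0x0D, 0x0A);
my $eol = detect-eol($in.slurp(:bin));

# Write "a" plus the detected terminator, again as raw bytes.
$out.spurt: Blob.new(0x61, |$eol.list);

# Show what ended up in the output file.
put $out.slurp(:bin).list.map(*.fmt('%02x')).join(' ');   # should print: 61 0d 0a
```

Writing Blobs rather than strings keeps any text-mode newline handling out of the picture, at the cost of having to encode the real line content yourself before writing it.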