I'm trying to read a downloaded HTML file:
my $file = "sn.html";
my $in_fh = open $file, :r;
my $text = $in_fh.slurp;
and I get the following error message:
Malformed UTF-8
in block <unit> at prog.p6 line 10
How can I avoid this and get access to the file's contents?
If you have some idea of the file's encoding, you can pass it to slurp explicitly.
From the documentation (https://docs.perl6.org/routine/slurp):
my $text_contents = slurp "path/to/file", enc => "latin1";
I used it today for a stupid file encoded in ISO-8859-1.
If you do not specify an encoding when opening a file, it will assume utf8. Apparently, the file that you wish to open contains bytes that cannot be interpreted as UTF-8. Hence the error message.
Depending on what you want to do with the file contents, you could either set the :bin named parameter to have the file opened in binary mode, or you could use the special utf8-c8 encoding, which will assume UTF-8 until it encounters bytes it cannot decode: in that case it will generate temporary code points.
See https://docs.raku.org/language/unicode#UTF8-C8 for more information.
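To make the options above concrete, here is a minimal sketch. It assumes a made-up sample file (the name sn-sample.html and its four bytes are hypothetical, not the asker's file) containing "café" in ISO-8859-1, so the byte 0xE9 is not valid UTF-8 and a plain slurp would fail the same way:

```raku
# Hypothetical sample: "caf" followed by 0xE9 (é in ISO-8859-1), which is invalid UTF-8.
my $file = $*TMPDIR.add("sn-sample.html");
$file.spurt: Buf.new(0x63, 0x61, 0x66, 0xE9);

# Option 1: binary mode. Nothing is decoded; you get the raw bytes back.
my $bytes = $file.slurp(:bin);
say $bytes.elems;                 # 4

# Option 2: you know the encoding, so name it explicitly.
my $latin = $file.slurp(enc => 'latin1');
say $latin;                       # café

# Option 3: utf8-c8. Valid UTF-8 decodes as usual; the stray byte
# becomes a temporary (synthetic) code point instead of raising an error.
my $c8 = $file.slurp(enc => 'utf8-c8');
say $c8.starts-with('caf');       # True

# Encoding with utf8-c8 again is meant to give back the original bytes.
say $c8.encode('utf8-c8').elems;

$file.unlink;                     # remove the sample file
```

If you only find out the real encoding later, you can also start from the :bin result and call $bytes.decode('latin1') on it yourself.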