How to read in ISO 8859-1 (Latin-1) encoded text in Perl

Tags: input, encoding, perl, latin1

So I'm trying to write a Perl script to read in a file encoded in Latin-1, but it just isn't working. When I do a simple search for a character that I know is in the file (it's in the first line), nothing is found. I'm using use encoding "iso 8859-1"; below, but I've also tried binmode(STDIN, ":utf8");. Any suggestions on what I might be doing wrong, and how to make it right?

use encoding "iso 8859-1";

while(<>)
{
    if(/ó/gi)
    {
        print "Found one!\n";
    }
}


asked Nov 19 '10 01:11

John Montgomery

1 Answer

Don’t use the use encoding pragma: it’s broken.

Either specify the encoding here:

use open ":encoding(Latin1)";

or put it in the open itself:

open(FH, "< :encoding(Latin1)", $pathname)
   || die "can't open $pathname: $!";

or binmode it after opening:

binmode(FH, ":encoding(Latin1)")
   || die "can't binmode to encoding Latin1";
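For illustration, here is a minimal sketch of the second option using a lexical filehandle instead of the bareword FH. The temporary file, its contents, and the variable names are assumptions made up for this example; they are not from the question. The pattern uses the escape \x{F3} (the code point of "ó") so the match does not depend on how the script file itself is encoded.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: write the raw Latin-1 byte 0xF3 ("ó") to a temp file.
my ($tmp, $pathname) = tempfile(UNLINK => 1);
binmode($tmp, ":raw") || die "can't binmode $pathname: $!";
print {$tmp} "coraz\xF3n\n";
close($tmp) || die "can't close $pathname: $!";

# Decode Latin-1 bytes into characters as the file is read.
open(my $fh, "< :encoding(Latin1)", $pathname)
    || die "can't open $pathname: $!";
my $line = <$fh>;
close($fh) || die "can't close $pathname: $!";
chomp $line;

# U+00F3 is "ó"; match on the decoded character, not on a raw byte.
my $matched = $line =~ /\x{F3}/ ? 1 : 0;
print "matched: $matched, length: ", length($line), "\n";
```

The length is counted in characters here, because the layer has already decoded the input by the time the line reaches the regex.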

If you’re using <ARGV>, then use open is probably easiest.
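A rough sketch of the question's loop rewritten with use open, assuming the files named on the command line are Latin-1. To keep the example self-contained it builds a small temporary file and places its name in @ARGV; that file and its contents are hypothetical stand-ins for real input.

```perl
use strict;
use warnings;
use open ":encoding(Latin1)";   # default layer for files opened in this scope, including <ARGV>
use File::Temp qw(tempfile);

# Hypothetical sample data: one line containing byte 0xF3 ("ó"), one without.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp, ":raw") || die "can't binmode $path: $!";
print {$tmp} "canci\xF3n\nplain line\n";
close($tmp) || die "can't close $path: $!";

# Stand-in for file names that would normally come from the command line.
@ARGV = ($path);

my $found = 0;
while (my $line = <>) {
    # \x{F3} is the code point of "ó", independent of the script's own encoding.
    $found++ if $line =~ /\x{F3}/;
}
print "Found $found matching line(s)\n";
```

The /g modifier from the question is dropped because a single boolean test per line does not need it.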

Don’t forget to set the encoding on your output streams, too.
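As a sketch of that last point, assuming whatever consumes the script's output (a terminal, a pipe, a file) expects UTF-8, the standard streams can be given an encoding layer with binmode before anything is printed. The sample string is made up for the example.

```perl
use strict;
use warnings;

# Assumption: whatever reads this script's output expects UTF-8.
binmode(STDOUT, ":encoding(UTF-8)") || die "can't binmode STDOUT: $!";
binmode(STDERR, ":encoding(UTF-8)") || die "can't binmode STDERR: $!";

# A decoded string, as it would look after reading through :encoding(Latin1).
my $line = "Found one: canci\x{F3}n";   # U+00F3 is "ó"
print "$line\n";
```

Without an output layer, Perl would write this character as the single byte 0xF3, which is Latin-1 rather than UTF-8, so a UTF-8 terminal would likely show it as a replacement character.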


answered Sep 30 '22 05:09

tchrist
