What string should be used to specify encoding in Perl POD, "utf8", "UTF-8" or "utf-8"?

Tags:

encoding

documentation

utf-8

perl

It is possible to write Perl documentation in UTF-8. To do so, you declare the encoding in your POD:

=encoding NNN

But what should you write in place of NNN? Different sources give different answers.

perlpod says that it should be =encoding utf8
this stackoverflow answer states that it should be =encoding UTF-8
and this answer tells me to write =encoding utf-8

Which one is correct? What is the exact string to write in POD?

asked Aug 07 '13 16:08

bessarabov

2 Answers

=encoding UTF-8

According to IANA, charset names are case-insensitive, so utf-8 is the same.

utf8 is Perl's lax variant of UTF-8. For safety, however, you want your POD processors to apply the strict encoding.

answered Sep 21 '22 13:09

daxim

As daxim points out, I have been misled. =encoding UTF-8 and =encoding utf-8 apply the strict encoding, and =encoding utf8 applies the lenient encoding:

$ cat enc-test.pod
=encoding ENCNAME

=head1 TEST '\344\273\245\376\202\200\200\200\200\200'

=cut

(Here \xxx means the literal byte with octal value xxx. \344\273\245 is a valid UTF-8 sequence; \376\202\200\200\200\200\200 is not.)

`=encoding utf-8`:

$ perl -pe 's/ENCNAME/utf-8/' enc-test.pod | pod2cpanhtml | grep /h1
>TEST &#39;&#20197;&#27492;&#65533;&#39;</a></h1>

`=encoding utf8`:

$ perl -pe 's/ENCNAME/utf8/' enc-test.pod | pod2cpanhtml | grep /h1
Code point 0x80000000 is not Unicode, no properties match it; ...
Code point 0x80000000 is not Unicode, no properties match it; ...
Code point 0x80000000 is not Unicode, no properties match it; ...
>TEST &#39;&#20197;&#2147483648;&#39;</a></h1>

My original answer, which the test above contradicts, was that they are all equivalent. The argument to =encoding is expected to be a name recognized by the Encode::Supported module. When you drill down into that document, you see that

the canonical encoding name is utf8,
the name UTF-8 is an alias for utf8, and
names are case insensitive, so utf-8 is equivalent to UTF-8.

What's the best practice? I'm not sure. I don't think you can go wrong using the official IANA name (as per daxim's answer), but you can't go wrong following the official Perl documentation, either.

answered Sep 17 '22 13:09

mob
