TomC recommends decomposing Unicode characters on the way in, and recomposing on the way out (http://www.perl.com/pub/2012/04/perl-unicode-cookbook-always-decompose-and-recompose.html).
The former makes perfect sense to me, but I can't see why he recommends recomposing on the way out. Potentially you could save a small amount of space if your text is heavy with European accented characters, but you're just pushing that on to someone else's decomposition function.
Are there any other obvious reasons I'm missing?
Essentially, the Unicode Normalization Algorithm puts all combining marks into a specified order and uses rules for decomposition and composition to transform each string into one of the Unicode Normalization Forms. A binary comparison of the transformed strings then determines equivalence.
In other words, the standard defines a text normalization procedure, called Unicode normalization, that replaces equivalent sequences of characters so that any two equivalent texts are reduced to the same sequence of code points, called the normalization form (or normal form) of the original text.
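The equivalence described above can be demonstrated with Python's standard `unicodedata` module (the rest of this page uses Perl, but the behavior is the same in any language with a normalization API):

```python
import unicodedata

# Two canonically equivalent spellings of "é":
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

print(composed == decomposed)                    # False: different code points
print(unicodedata.normalize("NFC", decomposed))  # recomposed to the single code point
print(unicodedata.normalize("NFC", composed) ==
      unicodedata.normalize("NFC", decomposed))  # True: equal after normalization
```

After both strings are transformed to the same normalization form, a plain binary comparison suffices.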
As Ven'Tatsu writes in a comment, there is software that can handle composed characters but not decomposed characters. Though the opposite is theoretically possible too, I have never seen it in practice and expect it to be rare.
To just display a decomposed character, the rendering software needs to deal with combining diacritic marks. It does not suffice to find them in the font. The renderer needs to position the diacritic properly, using information about the dimensions of the base character. There are often problems with this, resulting in poor rendering—especially if the rendering uses the diacritic from a different font! The result can hardly be better than what is achieved by simply displaying the glyph of a precomposed character like “é”, designed by a typographer.
(Rendering software can also analyze the situation and effectively map the decomposed character to a precomposed character. But that would require extra code.)
It's quite simple: Most tools have limited Unicode support; they assume characters are in the NFC form.
For example, this is commonly how people compare strings:
perl -CSDA -e 'use utf8; if ($ARGV[0] eq "Éric") { ... }'
And of course, the "É" is in NFC form (since that's what almost everything produces), so this program only accepts arguments in NFC form.
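The same fragility, and the obvious fix of normalizing both sides before comparing, can be sketched in Python with the standard `unicodedata` module (`equals_nfc` is a hypothetical helper, not part of the one-liner above):

```python
import unicodedata

def equals_nfc(a: str, b: str) -> bool:
    # Normalize both sides so either spelling of the accent matches.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

nfc_arg = "\u00c9ric"    # "Éric" with precomposed É (U+00C9)
nfd_arg = "E\u0301ric"   # "Éric" with E + combining acute accent

print(nfc_arg == nfd_arg)            # False: the naive comparison rejects NFD input
print(equals_nfc(nfc_arg, nfd_arg))  # True: normalization makes them compare equal
```

A program that only ever does the naive comparison silently works for NFC input and silently fails for NFD input, which is exactly why recomposing on the way out is the friendly default.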