Which is the better Unicode Normalization Form?

1 Answer

For what? For saving a file, use NFC, since that is what the W3C character model for the web uses (strictly, W3C normalisation requires both that the stream be in NFC and that the text still be in NFC after entities in HTML or XML are converted to the characters they represent). The odds that it'll ever make a practical difference are slim, though it could stop a few rather obscure issues from upsetting someone down the line.
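As a minimal sketch of the "save as NFC" advice, assuming Python and its standard `unicodedata` module (the helper name `save_text_nfc` is made up for illustration):

```python
import unicodedata
from pathlib import Path


def save_text_nfc(path, text):
    """Normalise text to NFC, then write it out as UTF-8 (hypothetical helper)."""
    Path(path).write_text(unicodedata.normalize("NFC", text), encoding="utf-8")
```

Calling `save_text_nfc("notes.txt", "cafe\u0301")` would store the precomposed é (U+00E9) rather than e followed by a combining accent.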

Normalisation makes certain equivalent sequences result in identical streams. For example, U+0065 (e) followed by U+0301 (a combining acute accent) is equivalent to U+00E9 (é) on its own.
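A short illustration of that equivalence, sketched in Python with the standard `unicodedata` module (the variable names are only for this example):

```python
import unicodedata

decomposed = "e\u0301"   # U+0065 followed by U+0301
precomposed = "\u00e9"   # U+00E9 on its own

# The two strings render the same but are different code point sequences.
print(decomposed == precomposed)  # False

# After normalising both to the same form, they compare equal.
nfc_a = unicodedata.normalize("NFC", decomposed)
nfc_b = unicodedata.normalize("NFC", precomposed)
print(nfc_a == nfc_b)  # True
```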

NFD splits all such characters up into their component parts (e.g. turning U+00E9 into U+0065 followed by U+0301). If there are two or more combining characters in a row, they are re-ordered according to rules that guarantee consistency (ḉ could have the cedilla followed by the acute or the acute followed by the cedilla, and we need a consistent ordering so that the same string is produced either way). Mostly NFD is useful for internal processing as part of another task, such as stripping accents, or producing NFC.
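Both points can be sketched in Python. The `strip_accents` helper below is a hypothetical name, and it makes a simplifying assumption: it drops every character with a non-zero canonical combining class, which is good enough for Latin text but is not a complete accent-removal algorithm.

```python
import unicodedata

# The same letter typed with its marks in two different orders.
acute_first = "c\u0301\u0327"    # c, combining acute, combining cedilla
cedilla_first = "c\u0327\u0301"  # c, combining cedilla, combining acute

# NFD puts the marks into one canonical order, so both inputs agree.
print(unicodedata.normalize("NFD", acute_first) ==
      unicodedata.normalize("NFD", cedilla_first))  # True


def strip_accents(text):
    """Decompose with NFD, then drop the combining marks (illustrative only)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("caf\u00e9"))  # cafe
```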

NFC starts with NFD and then combines the characters together again where possible, barring a few exceptions to ensure that what was a normalised string with one version of Unicode remains so with another.
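A small Python sketch of the recombination and of one such exception. The choice of U+0958 (DEVANAGARI LETTER QA) as the example is mine; it is one of the characters that Unicode excludes from composition, so NFC leaves it decomposed.

```python
import unicodedata

# The usual case: NFC recombines the pieces into the precomposed character.
print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")  # True

# An excluded character: NFC decomposes U+0958 and does not recombine it.
qa = "\u0958"
nfc_qa = unicodedata.normalize("NFC", qa)
print(nfc_qa == "\u0915\u093c")  # True

# Normalising something that is already in NFC changes nothing.
print(unicodedata.normalize("NFC", nfc_qa) == nfc_qa)  # True
```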

NFKD goes further than NFD by also replacing certain compatibility characters with their plainer equivalents. ⁵, for example, is replaced with 5. This "damages" the text (a user may reasonably choose ⁵ over 5 for a good reason) but is useful for searching (search for "fiſh" on Google and it returns results for "fish" because it treats the long-s the same as a short-s) and as a restriction in certain cases to avoid security issues with similar but different characters. NFKC first does NFKD and then combines in the same manner as NFC.
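A sketch of the search use case in Python. The `search_key` helper is hypothetical, and adding case folding on top of NFKC is my assumption about what a search index would want, not something the forms themselves do.

```python
import unicodedata

# Compatibility forms fold the superscript into a plain digit...
print(unicodedata.normalize("NFKD", "x\u2075"))  # x5

# ...while the canonical forms leave it alone.
print(unicodedata.normalize("NFC", "x\u2075") == "x\u2075")  # True


def search_key(text):
    """Fold text for matching: NFKC, then case folding (illustrative only)."""
    return unicodedata.normalize("NFKC", text).casefold()


# The long s (U+017F) matches an ordinary s once both sides are folded.
print(search_key("fi\u017fh") == search_key("Fish"))  # True
```

Keep the original text for storage and display, and use a folded key like this only for comparison, since the folding cannot be undone.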

See http://unicode.org/reports/tr15/ for the full skinny, and "use NFC but don't worry about it" to repeat the short answer.

179

answered Oct 10 '22 21:10

Jon Hanna


Tags:

forms

normalization

unicode-normalization

dreamweaver

Miki
