Proper NFD form of emoji and comparison

Question

Given that there is now a selector for textual vs emoji display for some codepoints, what is the proper decomposed form of those codepoints? For instance, ❤︎ (U+2764) defaults to a text representation, but can become an emoji if followed by VS-16 (U+fe0f): ❤️. You can force a text representation with VS-15 (U+fe0e). Does this mean the NFD for U+2764 should become U+2764 U+fe0e? Should U+2764 U+fe0e and U+2764 be treated as the same (in the same way é (U+00e9) is the same as é (U+0065 U+0301))? What about the text vs emoji representations? Should they be treated the same as well?

nwellnhof · Accepted Answer

There's no decomposition mapping in the Unicode database for emojis and variation selectors. The standard even states:

The initial character in a variation sequence is never [...] a canonical decomposable character.

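This means that an emoji, with or without a variation selector, doesn't change under NFD. It also means that U+2764 and U+2764 U+FE0E are not canonically equivalent, unlike U+00E9 and U+0065 U+0301: normalization leaves the variation selector in place, so the two sequences still differ afterwards.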
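
A quick way to check this is with Python's standard `unicodedata` module. The sketch below is only an illustration; the helper names `nfd` and `code_points` are made up for this example, and the results depend on the Unicode data shipped with your Python version.

```python
import unicodedata

HEART = "\u2764"  # HEAVY BLACK HEART
VS15 = "\ufe0e"   # VARIATION SELECTOR-15 (requests text presentation)
VS16 = "\ufe0f"   # VARIATION SELECTOR-16 (requests emoji presentation)


def nfd(s: str) -> str:
    """Return the NFD form of s."""
    return unicodedata.normalize("NFD", s)


def code_points(s: str) -> str:
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# None of the three characters has a decomposition mapping,
# so decomposition() returns an empty string for each.
for ch in (HEART, VS15, VS16):
    print(code_points(ch), repr(unicodedata.decomposition(ch)))

# NFD leaves each sequence exactly as it was.
for s in (HEART, HEART + VS15, HEART + VS16):
    print(code_points(s), "->", code_points(nfd(s)))

# Canonically equivalent strings compare equal after NFD...
print(nfd("\u00e9") == nfd("e\u0301"))   # True
# ...but the heart with and without VS-15 does not.
print(nfd(HEART) == nfd(HEART + VS15))   # False
```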

Also, to my knowledge, Unicode doesn't specify the default representation of a code point without a variation selector. This is up to the implementation.
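
Since normalization does not make the text and emoji sequences equal, an application that wants to treat them as the same would have to do that itself. Here is a minimal sketch of one possible policy, assuming you only care about VS-15 and VS-16; the function name `presentation_insensitive_key` is hypothetical and this is not something the standard prescribes.

```python
import unicodedata

# Assumption: only the two presentation selectors are ignored.
# VS-1 through VS-14 and the supplementary selectors are left alone.
_PRESENTATION_SELECTORS = {"\ufe0e", "\ufe0f"}


def presentation_insensitive_key(s: str) -> str:
    """Return a comparison key: NFD with VS-15/VS-16 stripped."""
    normalized = unicodedata.normalize("NFD", s)
    return "".join(c for c in normalized if c not in _PRESENTATION_SELECTORS)


plain = "\u2764"
text_style = "\u2764\ufe0e"
emoji_style = "\u2764\ufe0f"

# All three forms map to the same key under this policy.
print(
    presentation_insensitive_key(plain)
    == presentation_insensitive_key(text_style)
    == presentation_insensitive_key(emoji_style)
)
```

Whether this is the right comparison depends on the use case: it is reasonable for search or deduplication, but it discards the writer's stated presentation preference.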

Tags: unicode, nfd

Chas. Owens

1 Answers

nwellnhof

Recent Activity

Donate For Us

Proper NFD form of emoji and comparison

Tags:

unicode

nfd

Chas. Owens

1 Answers

nwellnhof

Related questions

Recent Activity

Donate For Us