Is the order in which combining diacritic marks appear after a codepoint important?

Question

I wonder whether the order in which combining diacritical marks appear after a code point changes how the diacritics should be stacked above or below the character, or whether it makes some other semantic difference.

Does normalization specify a way to reorder diacritics, e.g. to speed up string comparison?

Joachim Sauer · Accepted Answer

According to this Wikipedia article, the order of combining characters is significant in some cases, while in others normalization rearranges it into a defined order.

Concretely, the relative order of combining characters with the same canonical combining class must be preserved (i.e. it is significant), while marks with different non-zero combining classes are sorted by combining class during normalization.
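To illustrate the sorting half of that rule, here is a minimal sketch using Python's standard `unicodedata` module. The helper `canonical_classes` and the choice of marks (acute accent, class 230, and dot below, class 220) are my own for illustration and are not part of the original answer.

```python
import unicodedata

def canonical_classes(text):
    """Return (code point, canonical combining class) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]

# U+0301 COMBINING ACUTE ACCENT has class 230 (above),
# U+0323 COMBINING DOT BELOW has class 220 (below).
acute_then_dot = "a\u0301\u0323"
dot_then_acute = "a\u0323\u0301"

nfd_a = unicodedata.normalize("NFD", acute_then_dot)
nfd_b = unicodedata.normalize("NFD", dot_then_acute)

# Marks with different classes are sorted by class, so both inputs
# end up as the same sequence: a, U+0323 (220), U+0301 (230).
print(canonical_classes(nfd_a))
print(nfd_a == nfd_b)  # True
```

Because the two marks belong to different combining classes, their typed order carries no meaning, and normalizing both strings before comparing makes them equal.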

bobince · Answer

Yes, it's important, and it has to be, so that cases like these stay unambiguous:

Normal form D: u, U+0308, U+0304 -> Normal form C: U+01D6 Latin Small Letter U With Diaeresis And Macron (ǖ)
Normal form D: u, U+0304, U+0308 -> Normal form C: U+1E7B Latin Small Letter U With Macron And Diaeresis (ṻ)

In general, within a combining class you start with the mark closest to the letter and work away from it.
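The two sequences can be checked with Python's standard `unicodedata` module. This is only a sketch, using a lowercase u as the base letter; the variable names are mine.

```python
import unicodedata

# Both marks have combining class 230, so their relative order is significant.
diaeresis_first = "u\u0308\u0304"  # u, COMBINING DIAERESIS, COMBINING MACRON
macron_first = "u\u0304\u0308"     # u, COMBINING MACRON, COMBINING DIAERESIS

nfc_a = unicodedata.normalize("NFC", diaeresis_first)
nfc_b = unicodedata.normalize("NFC", macron_first)

for composed in (nfc_a, nfc_b):
    # Print the code point and official character name of each composed result.
    print(f"U+{ord(composed):04X}", unicodedata.name(composed))

# Normalization does not swap marks of the same class, so the strings stay distinct.
print(nfc_a == nfc_b)  # False
```

The first sequence composes to U+01D6 and the second to U+1E7B, so the two inputs never compare equal under any normalization form.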
