 

Is a string in a compatibility normal form already in the corresponding canonical normal form?

My tests tell me that, as of Unicode 6.2, all characters in full compatibility decompositions have the property NFD_Quick_Check=Yes.

This leads me to believe that isNFKD(x) implies isNFD(x), and isNFKC(x) implies isNFC(x).

Are my conclusions correct? And what about stability? Are these implications guaranteed to hold for future versions of the Unicode standard?

R. Martinho Fernandes asked Mar 28 '13 23:03



1 Answer

Your conclusions are correct. The "Design Goals" section of Unicode Standard Annex #15 states:

toNFKC(x) = toNFC(toNFKC(x))
toNFKD(x) = toNFD(toNFKD(x))

With regard to stability, this will continue to hold in future versions of Unicode, provided the normalized string contains no unassigned code points.
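
The two identities can be spot-checked empirically. The following is a minimal Python sketch, not part of the original answer. It assumes the standard `unicodedata` module (with `is_normalized`, available from Python 3.8), and the helper names `check_identities`, `implication_holds` and `find_counterexamples` are made up for illustration. It only tests against the Unicode version bundled with the interpreter, so it says nothing about future versions.

```python
import sys
import unicodedata


def check_identities(text):
    """Return True if toNFKC(x) == toNFC(toNFKC(x)) and
    toNFKD(x) == toNFD(toNFKD(x)) both hold for `text`."""
    nfkc = unicodedata.normalize("NFKC", text)
    nfkd = unicodedata.normalize("NFKD", text)
    return (
        unicodedata.normalize("NFC", nfkc) == nfkc
        and unicodedata.normalize("NFD", nfkd) == nfkd
    )


def implication_holds(text):
    """Return True if isNFKD(x) implies isNFD(x) and
    isNFKC(x) implies isNFC(x) for `text` (Python 3.8+)."""
    is_norm = unicodedata.is_normalized
    nfkd_ok = (not is_norm("NFKD", text)) or is_norm("NFD", text)
    nfkc_ok = (not is_norm("NFKC", text)) or is_norm("NFC", text)
    return nfkd_ok and nfkc_ok


def find_counterexamples(code_points=None):
    """Test each code point as a one-character string and return
    those for which either check fails. Surrogates are skipped.
    By default every code point is tested, which takes a while."""
    if code_points is None:
        code_points = range(sys.maxunicode + 1)
    failures = []
    for cp in code_points:
        if 0xD800 <= cp <= 0xDFFF:
            continue
        ch = chr(cp)
        if not (check_identities(ch) and implication_holds(ch)):
            failures.append(cp)
    return failures


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("Counterexamples found:", len(find_counterexamples()))
```

Checking single code points is only a sanity check; the guarantee for arbitrary strings comes from the definition of the normalization forms in UAX #15, not from a test like this.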

nwellnhof answered Oct 07 '22 06:10
