
What are all characters that the Unicode Other_ID_Start and Other_ID_Continue properties include?

Tags: unicode

While reading the definitions of ID_Start and ID_Continue, I found this: https://unicode.org/reports/tr31/#D1. It says that ID_Start includes Other_ID_Start and ID_Continue includes Other_ID_Continue, but I'm unable to find the definitions of these two Other_ properties. The document says that they're defined by UAX44, so I consulted the Unicode 15 version of UAX44: https://www.unicode.org/reports/tr44/tr44-30.html. Table 9 (Property Table) only says:

Other_ID_Start Used to maintain backward compatibility of ID_Start.

Other than that, there is no additional information. What am I missing?

Hydroper asked Oct 24 '25 15:10


1 Answer

Other_ID_Start and Other_ID_Continue, like most binary character properties, are defined in the PropList.txt data file in the Unicode Character Database.

In particular, Other_ID_Start includes characters that used to be included in ID_Start automatically due to some other property they possessed, but now need to be specified manually because said property value has since changed. For example, U+212E ESTIMATED SYMBOL was originally classified as a letter and all letters are ID_Start by default, but later it was reclassified as a symbol and thus would have been excluded if it weren’t for the backwards compatibility requirement.
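To list the characters yourself, you can parse PropList.txt and keep only the two properties of interest. The following is a minimal sketch, not an official tool: the function name `parse_prop_list` and the `SAMPLE` text are made up for illustration, and the sample is only a short excerpt in the file's format (code point or range, a semicolon, the property name, then a `#` comment). For the authoritative list, feed the function the real file for the Unicode version you care about, for example https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt.

```python
from collections import defaultdict

# Abridged excerpt in PropList.txt format, for illustration only.
# The real file contains more entries and many more properties.
SAMPLE = """\
# PropList-style sample (not the full file)
0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>
00B7          ; Other_ID_Continue # Po       MIDDLE DOT
0387          ; Other_ID_Continue # Po       GREEK ANO TELEIA
1885..1886    ; Other_ID_Start # Mn   [2] MONGOLIAN LETTER ALI GALI BALUTA..
2118          ; Other_ID_Start # Sm       SCRIPT CAPITAL P
212E          ; Other_ID_Start # So       ESTIMATED SYMBOL
"""


def parse_prop_list(text, wanted):
    """Return {property: [(first, last), ...]} for the wanted properties.

    Each range is an inclusive pair of integer code points.
    """
    result = defaultdict(list)
    for line in text.splitlines():
        # Drop the trailing comment, then skip blank and comment-only lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        fields = [field.strip() for field in data.split(";")]
        code_points, prop = fields[0], fields[1]
        if prop not in wanted:
            continue
        # A line holds either a single code point or a "first..last" range.
        first, _, last = code_points.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        result[prop].append((start, end))
    return dict(result)


if __name__ == "__main__":
    props = parse_prop_list(SAMPLE, {"Other_ID_Start", "Other_ID_Continue"})
    for prop, ranges in sorted(props.items()):
        for start, end in ranges:
            for cp in range(start, end + 1):
                print(f"{prop}: U+{cp:04X}")
```

To run it against the real data, download PropList.txt, read it with `open(path, encoding="utf-8").read()`, and pass that text in place of `SAMPLE`.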

CharlotteBuff answered Oct 27 '25 02:10


