
Is there a way to tell whether a Unicode character is a control, alphabetic, numeric, or symbolic character?

Assuming all you have is the binary data and no pre-canned functions, is there a pattern or algorithm to categorize the type of character?

Oorang asked Nov 20 '25 09:11



1 Answer

You ask an API to tell you. In Java, use the Character class (for example, Character.getType). In C++, you can use ICU. If your language doesn't have an equivalent, download the Unicode Character Database (UnicodeData.txt and related files) from unicode.org and incorporate it.
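As an illustration of the "ask an API" approach, here is a minimal sketch in Python, whose standard unicodedata module plays a role similar to Java's Character class. The classify function and its coarse labels are hypothetical names chosen for this example; only unicodedata.category is the library's own call.

```python
import unicodedata


def classify(ch: str) -> str:
    """Map a character's Unicode General Category to a coarse class.

    The labels below are illustrative, not an official Unicode taxonomy.
    """
    # category() returns a two-letter code such as "Lu" or "Nd";
    # the first letter is the major class.
    major = unicodedata.category(ch)[0]
    return {
        "L": "alpha",
        "N": "numeric",
        "S": "symbol",
        "P": "punctuation",
        "Z": "separator",
        "M": "mark",
        "C": "control/other",
    }[major]
```

For example, classify("A") gives "alpha" and classify("+") gives "symbol". Note that the "C" class also covers format, surrogate, private-use, and unassigned code points, so check for the exact code "Cc" if you need control characters only.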

In other words, there is no pattern or algorithm. The Unicode Consortium publishes tables that contain the information, and the type of each character is recorded there as its General Category property.
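If you do have to incorporate the tables yourself, the lookup amounts to parsing a semicolon-separated file. The sketch below assumes the UnicodeData.txt layout, where field 0 is the code point in hex and field 2 is the General Category. The function names and the tiny embedded sample are hypothetical stand-ins for the real file, and the sketch deliberately ignores the "First"/"Last" range entries that the full file uses for large blocks such as CJK ideographs.

```python
# A few lines in the UnicodeData.txt format (sample only; the real file
# from unicode.org has tens of thousands of entries).
SAMPLE_UNICODE_DATA = """\
0000;<control>;Cc;0;BN;;;;;N;NULL;;;;
002B;PLUS SIGN;Sm;0;ES;;;;;N;;;;;
0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
"""


def load_categories(text: str) -> dict[int, str]:
    """Build a code point -> General Category table from UnicodeData-style text."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(";")
        table[int(fields[0], 16)] = fields[2]
    return table


def category_of(code_point: int, table: dict[int, str]) -> str:
    """Look up a code point; anything not listed is treated as unassigned (Cn)."""
    return table.get(code_point, "Cn")
```

With the sample loaded via load_categories(SAMPLE_UNICODE_DATA), category_of(0x41, table) yields "Lu" and category_of(0x30, table) yields "Nd". A production version would also expand the range entries and would probably store the result as sorted ranges rather than one dictionary entry per code point.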

bmargulies answered Nov 21 '25 23:11



