Assuming all you have is the binary data and no pre-canned functions, is there a pattern or algorithm to categorize the type of character?
You ask an API to tell you. In Java, you use the Character class. In C++, you can use ICU. If your language doesn't have an equivalent, you download the Unicode Character Database from unicode.org and build the lookup into your program.
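As a rough illustration of the Java route, here is a minimal sketch that asks `Character.getType` for a code point's general category. The class name `CharCategory`, the method `categorize`, and the label strings are made up for this example, and only a handful of categories are handled.

```java
class CharCategory {
    // Maps a code point to a coarse, human-readable label.
    // Character.getType returns one of the general-category constants
    // defined on java.lang.Character; the labels are illustrative only.
    static String categorize(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "uppercase letter";
            case Character.LOWERCASE_LETTER:     return "lowercase letter";
            case Character.DECIMAL_DIGIT_NUMBER: return "decimal digit";
            case Character.SPACE_SEPARATOR:      return "space separator";
            case Character.UNASSIGNED:           return "unassigned";
            default:                             return "other";
        }
    }

    public static void main(String[] args) {
        int[] samples = { 'A', 'z', '7', ' ', 0x00E9 };
        for (int cp : samples) {
            System.out.printf("U+%04X: %s%n", cp, categorize(cp));
        }
    }
}
```

The answer is only as current as the Unicode version bundled with the Java runtime in use, since the JDK ships its own copy of the tables.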
In other words, there is no pattern or algorithm. The information lives in tables published by the Unicode Consortium, and every API that answers this question is a lookup into those tables.
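If you do have to incorporate the tables yourself, the idea might be sketched as follows, assuming input lines in the semicolon-separated layout of `UnicodeData.txt`, where the first field is the code point in hex and the third is the general category. The class `UnicodeTable` and its methods are hypothetical names, and the sketch deliberately ignores the First/Last range entries the real file uses for large blocks such as CJK ideographs, so treat it as a starting point rather than a complete loader.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class UnicodeTable {
    // Code point -> general category abbreviation (e.g. "Lu", "Nd").
    private final Map<Integer, String> categoryByCodePoint = new HashMap<>();

    // Loads lines in the UnicodeData.txt field layout:
    // field 0 = code point in hex, field 2 = general category.
    // Assumption: range entries ("<..., First>" / "<..., Last>") are not
    // expanded here; a real loader would need to handle them.
    void load(Iterable<String> lines) {
        for (String line : lines) {
            String[] fields = line.split(";", -1);
            if (fields.length < 3) {
                continue; // skip blank or malformed lines
            }
            categoryByCodePoint.put(Integer.parseInt(fields[0], 16), fields[2]);
        }
    }

    // Code points absent from the table are reported as "Cn" (unassigned).
    String category(int codePoint) {
        return categoryByCodePoint.getOrDefault(codePoint, "Cn");
    }

    public static void main(String[] args) {
        UnicodeTable table = new UnicodeTable();
        // Sample lines in the file's format; a real program would read the
        // whole downloaded file instead.
        table.load(Arrays.asList(
            "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
            "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"));
        System.out.println(table.category(0x41));
        System.out.println(table.category(0x30));
    }
}
```

A hash map keeps the sketch short; a production lookup would more likely use a sorted range table or a trie to keep memory use down.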