I'm new to Unicode and not sure how much I need to learn coming from an ASCII background. I'm reading the C# spec's rules for identifiers to determine which characters are permitted within Azure Table (whose rules are based directly on the C# spec).
Where can I find a list of Unicode characters that fall into these categories:
letter-character: A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
combining-character: A Unicode character of classes Mn or Mc
decimal-digit-character: A Unicode character of the class Nd
connecting-character: A Unicode character of the class Pc
formatting-character: A Unicode character of the class Cf
You can retrieve this information in an automated fashion from the official Unicode data file, UnicodeData.txt, which the Unicode Consortium publishes as part of the Unicode Character Database.
Each line of this file is one record of semicolon-separated values. The third column tells you the character class (the two-letter general category, such as Lu or Nd) of each character.
The benefit of this is that you can get the character name for each character, so you have a better idea of what it is than by just looking at the character itself (e.g. would you know what ბ is? That’s right, it’s Ban. In Georgian. :-) )
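As a rough illustration of reading that file, here is a minimal Python sketch that groups records by the identifier classes from the question. The names `IDENTIFIER_CLASSES`, `group_by_identifier_class` and `SAMPLE` are made up for this example, and the sample records are trimmed to the first three fields (real lines carry more columns). It also assumes every line describes a single character, so it does not expand the "First"/"Last" range pairs that the real file uses for large blocks such as CJK ideographs.

```python
from collections import defaultdict

# General categories named by the C# identifier grammar quoted in the question.
IDENTIFIER_CLASSES = {
    "letter-character": {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"},
    "combining-character": {"Mn", "Mc"},
    "decimal-digit-character": {"Nd"},
    "connecting-character": {"Pc"},
    "formatting-character": {"Cf"},
}


def group_by_identifier_class(lines):
    """Map each identifier class to a list of (code point, name) pairs."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        code_point = int(fields[0], 16)  # first column: hexadecimal code point
        name = fields[1]                 # second column: character name
        category = fields[2]             # third column: general category
        for cls, categories in IDENTIFIER_CLASSES.items():
            if category in categories:
                groups[cls].append((code_point, name))
    return dict(groups)


# Hypothetical sample, trimmed to the first three columns of each record.
SAMPLE = [
    "0024;DOLLAR SIGN;Sc",
    "0035;DIGIT FIVE;Nd",
    "005A;LATIN CAPITAL LETTER Z;Lu",
    "005F;LOW LINE;Pc",
    "00AD;SOFT HYPHEN;Cf",
    "0301;COMBINING ACUTE ACCENT;Mn",
]

for cls, members in group_by_identifier_class(SAMPLE).items():
    for code_point, name in members:
        print(f"{cls}: U+{code_point:04X} {name}")
```

To run it against the real data, you could pass an open file object for a locally downloaded copy of UnicodeData.txt instead of `SAMPLE`, since the function only needs an iterable of lines.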
FileFormat.info has a list of Unicode characters by category:
http://www.fileformat.info/info/unicode/category/index.htm
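If you would rather build such a per-category list locally, one possible approach (a sketch, not necessarily how that site does it) is Python's standard `unicodedata` module. The helper name `characters_in_categories` is invented for this example. Keep in mind that the module reflects the Unicode version bundled with your Python interpreter, which may differ from the version the C# spec or Azure relies on.

```python
import sys
import unicodedata


def characters_in_categories(categories):
    """Yield (code point, name) for each character whose general category is in `categories`."""
    for code_point in range(sys.maxunicode + 1):
        ch = chr(code_point)
        if unicodedata.category(ch) in categories:
            # Some characters have no name in the database; use a placeholder.
            yield code_point, unicodedata.name(ch, "<unnamed>")


# Example: list the connecting characters (class Pc).
for code_point, name in characters_in_categories({"Pc"}):
    print(f"U+{code_point:04X} {name}")

# The Unicode version this interpreter's data is based on.
print("Unicode version:", unicodedata.unidata_version)
```

Passing a larger set such as `{"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}` gives the letter-character class in the same way, though the output is far too long to read by eye.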