In Python I can print a Unicode character by name (e.g. print(u'\N{snowman}')). Is there a way I can get a list of all valid names?
Character names are defined by the Unicode standard, so you are effectively asking for the Unicode standard list of codepoint names (as well as the list of name aliases, supported by Python 3.3 and up).
Each Python version supports a specific version of the Unicode standard; the unicodedata.unidata_version attribute tells you which one for a given Python runtime. The above links lead to the latest published Unicode version; replace UCD/latest in the URLs with the value of unicodedata.unidata_version for your Python version.
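As a rough sketch of that substitution: the URL below is an assumption about where the latest name list is published (the original links may differ), and versioned_ucd_url is a hypothetical helper name used only for illustration.

```python
import unicodedata

# Assumed location of the latest published name list; adjust to the link you use.
LATEST_URL = "https://www.unicode.org/Public/UCD/latest/ucd/NamesList.txt"

def versioned_ucd_url(latest_url, version=unicodedata.unidata_version):
    """Swap the 'UCD/latest' path segment for a concrete Unicode version."""
    return latest_url.replace("UCD/latest", version)

# Unicode version this interpreter was built against
print(unicodedata.unidata_version)
# The same URL, pinned to that version
print(versioned_ucd_url(LATEST_URL))
```

The printed version string differs between Python releases, so the resulting URL does too.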
Per codepoint, the unicodedata.name() function can tell you the official name, and unicodedata.lookup() gives you the inverse (name to codepoint).
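If you would rather build the list from within Python, one possible approach is to call unicodedata.name() on every codepoint and keep the ones that have a name. This is only a sketch: all_character_names is a hypothetical helper, and the result reflects the Unicode version bundled with your interpreter, not the latest standard. It does not include name aliases.

```python
import sys
import unicodedata

def all_character_names():
    """Yield (codepoint, name) for every codepoint this Python can name."""
    for cp in range(sys.maxunicode + 1):
        # Passing a default avoids the ValueError raised for unnamed codepoints
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            yield cp, name

names = dict(all_character_names())
print(len(names))        # count depends on the bundled Unicode version
print(names[0x2603])     # SNOWMAN
print(unicodedata.lookup("SNOWMAN") == "\u2603")  # True
```

Codepoints such as control characters and unassigned values have no name and are simply skipped.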
If you want a list of all Unicode character names, consider downloading the Unicode Character Database.
It is included in the base repositories of many Linux distributions (e.g. "unicode-ucd" on RHEL).
The package includes NamesList.txt, which contains the exhaustive list of Unicode character names.
Caution: NamesList.txt takes some time to download (it is larger than 1.5 MB).
Example:
21FE RIGHTWARDS OPEN-HEADED ARROW
21FF LEFT RIGHT OPEN-HEADED ARROW
@@ 2200 Mathematical Operators 22FF
@@+
@ Miscellaneous mathematical symbols
2200 FOR ALL
= universal quantifier
2201 COMPLEMENT
x (latin letter stretched c - 0297)
2202 PARTIAL DIFFERENTIAL
2203 THERE EXISTS
= existential quantifier
2204 THERE DOES NOT EXIST
: 2203 0338
2205 EMPTY SET
= null set
* used in linguistics to indicate a null morpheme or phonological "zero"
x (latin capital letter o with stroke - 00D8)
x (diameter sign - 2300)
~ 2205 FE00 zero with long diagonal stroke overlay form
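To pull just the names out of a file shaped like the excerpt above, a minimal sketch could look like the following. It assumes that character entries are the only lines starting with a hexadecimal codepoint in the first column, and that placeholder names in angle brackets (such as <control>) should be skipped; parse_names_list is a hypothetical helper name.

```python
import re

# A character entry starts at column 0 with a 4-6 digit hex codepoint,
# followed by whitespace and the name; annotation lines start differently.
ENTRY = re.compile(r"^([0-9A-F]{4,6})\s+(\S.*)$")

def parse_names_list(lines):
    """Return {codepoint: name} for the character entries in the given lines."""
    names = {}
    for line in lines:
        match = ENTRY.match(line.rstrip())
        # Skip placeholders such as <control> that are not real names
        if match and not match.group(2).startswith("<"):
            names[int(match.group(1), 16)] = match.group(2)
    return names

sample = """\
21FF LEFT RIGHT OPEN-HEADED ARROW
@@ 2200 Mathematical Operators 22FF
2200 FOR ALL
= universal quantifier
2204 THERE DOES NOT EXIST
: 2203 0338
""".splitlines()

print(parse_names_list(sample))
```

For the real file, you could pass an open file object instead of the sample lines; check the file's encoding before opening it, since that is not stated here.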