This is a sample program I made:
>>> print u'\u1212'
ሒ
>>> print '\u1212'
\u1212
>>> print unicode('\u1212')
\u1212
Why do I get \u1212 instead of ሒ when I print unicode('\u1212')?
I'm making a program to store data and not print it, so how do I store ሒ instead of \u1212? Now obviously I can't do something like:
x = u''+unicode('\u1212')
Interestingly even if I do that, here's what I get:
\u1212
Another fact that I think is worth mentioning:
>>> u'\u1212' == unicode('\u1212')
False
What do I do to store ሒ or some other character like that instead of \uxxxx?
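'\u1212' is a byte string (str) with 6 ASCII characters: \, u, 1, 2, 1, and 2. Without the u prefix, \u is not treated as an escape sequence.
unicode('\u1212') decodes those same bytes as ASCII, so it is a Unicode string with the same 6 characters: \, u, 1, 2, 1, and 2. That is why comparing it with u'\u1212' gives False.
u'\u1212' is a Unicode string with one character: ሒ.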
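The three cases can be checked by counting characters. This is a minimal sketch, written with explicit b'' and u'' prefixes so that it should behave the same under Python 2.6+ and Python 3.3+; the variable names are illustrative only.

```python
# b'\\u1212' spells out the same six bytes that Python 2 stores for '\u1212'.
raw = b'\\u1212'

# Decoding as ASCII mirrors what unicode('\u1212') does in Python 2
# when the default encoding is ASCII: the backslash stays a backslash.
as_text = raw.decode('ascii')

# A real Unicode literal: the \u escape is interpreted by the parser.
real = u'\u1212'

lengths = (len(raw), len(as_text), len(real))
same = (as_text == real)

print(lengths)  # (6, 6, 1)
print(same)     # False
```

The escape is only interpreted when the literal itself is a Unicode literal; converting a byte string afterwards does not reinterpret the backslash.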
To store ሒ itself rather than the escape text, use Unicode literals (with the u prefix) throughout:
u'\u1212'
If for some reason you need to convert '\u1212' to u'\u1212', use
'\u1212'.decode('unicode-escape')
(Note that in Python 3, strings are always Unicode, so '\u1212' is already the single character ሒ.)
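In Python 3 a conversion like this is only needed when the escape sequence arrives as literal text, for example when read from a file. The following is a sketch assuming Python 3 and ASCII-only input; `escaped` and `decoded` are illustrative names.

```python
# Six literal characters: backslash, u, 1, 2, 1, 2 (not an escape).
escaped = '\\u1212'

# str has no .decode() in Python 3, so go through bytes first,
# then let the unicode_escape codec interpret the \uXXXX sequence.
decoded = escaped.encode('ascii').decode('unicode_escape')

print(len(escaped), len(decoded))  # 6 1
print(decoded == '\u1212')         # True
```

The encode('ascii') step raises UnicodeEncodeError if the input already contains non-ASCII characters, so this sketch is only suitable for text made up entirely of ASCII and escape sequences.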