I wonder why, when I define:
a = [u'k',u'ę',u'ą']
and then type:
'k' in a
I get True, while:
'ę' in a
gives me False?
It's really giving me a headache, and it seems someone did this on purpose to drive people mad...
Why does this happen?
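In Python 2.x, comparing a unicode object with a byte string (str) makes Python decode the byte string implicitly using the default ASCII codec. That works for 'k', but it fails for non-ASCII characters such as 'ę', so the two are treated as unequal and a warning is issued: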
Warning (from warnings module):
File "__main__", line 1
UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
In Python 3.x the problem doesn't appear, because every string literal is already a Unicode string.
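To see why 'k' matches and 'ę' doesn't, here is a small sketch that performs the ASCII decoding step explicitly instead of letting the comparison do it behind the scenes. The helper name ascii_decodes is made up for this illustration, and I'm assuming the byte strings are UTF-8 encoded:

```python
# -*- coding: utf-8 -*-
# Sketch only: ascii_decodes is a hypothetical helper, not part of Python.

def ascii_decodes(raw):
    """Return True if the byte string can be decoded with the ASCII codec."""
    try:
        raw.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False

# Assumption: the byte strings are UTF-8, as in a UTF-8 source file or terminal.
k_bytes = u"k".encode("utf-8")  # a single ASCII byte
e_bytes = u"ę".encode("utf-8")  # two bytes, both outside the ASCII range

k_ok = ascii_decodes(k_bytes)   # decoding succeeds, so the comparison can work
e_ok = ascii_decodes(e_bytes)   # decoding fails, so Python 2 reports "unequal"
```

The snippet should run on both Python 2 and Python 3, since it only uses u'' literals and explicit encode/decode calls.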
Solution?
You can either make the string unicode:
>>> u'ę' in a
True
Now you're comparing two unicode objects, not a unicode object to a byte string.
Or encode both sides to the same byte encoding, for example UTF-8, before comparing:
>>> c = u"ç"
>>> u'ç'.encode('utf-8') == c.encode('utf-8')
True
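If the value you're looking up arrives as a byte string (from a file or a socket, say), another option is to decode it once and then do the membership test on Unicode text. This is just a sketch: to_text is a hypothetical helper name, and it assumes the bytes are UTF-8 encoded, which may not be true for your data:

```python
# -*- coding: utf-8 -*-
# Sketch only: to_text is a hypothetical helper, and UTF-8 is an assumption.

text_type = type(u"")  # unicode on Python 2, str on Python 3

def to_text(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding byte strings if needed."""
    if isinstance(value, text_type):
        return value
    return value.decode(encoding)

a = [u"k", u"ę", u"ą"]
raw = u"ę".encode("utf-8")    # stands in for a byte string read from outside

naive = raw in a              # False: bytes are compared against Unicode text
found = to_text(raw) in a     # True: both sides are Unicode text now
```

Decoding at the boundary and keeping everything as Unicode inside the program tends to avoid this whole class of surprises.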
Also, to use non-ASCII characters in a Python 2 source file, you have to declare the file's encoding in a comment on the first or second line:
# -*- coding: utf-8 -*-
# the rest of the program
Hope this helps!