with using python 2.7:
>myCity = 'Isparta'
>myCity.lower()
>'isparta'
#-should be-
>'ısparta'
tried some decoding, (like, myCity.decode("utf-8").lower()) but could not find how to do it.
how can lower this kinds of letters? ('I' > 'ı', 'İ' > 'i' etc)
EDIT: In Turkish, lower case of 'I' is 'ı'. Upper case of 'i' is 'İ'
UTF8 does not work for turkish characters.
Python ascii() Method. ASCII stands for American Standard Code for Information Interchange. It is a character encoding standard that uses numbers from 0 to 127 to represent English characters. For example, ASCII code for the character A is 65, and 90 is for Z. Similarly, ASCII code 97 is for a, and 122 is for z.
UTF-8 is one of the most commonly used encodings, and Python often defaults to using it. UTF stands for “Unicode Transformation Format”, and the '8' means that 8-bit values are used in the encoding. (There are also UTF-16 and UTF-32 encodings, but they are less frequently used than UTF-8.)
1. Python 2 uses str type to store bytes and unicode type to store unicode code points. All strings by default are str type — which is bytes~ And Default encoding is ASCII.
Some have suggested using the tr_TR.utf8
locale. At least on Ubuntu, perhaps related to this bug, setting this locale does not produce the desired result:
import locale
locale.setlocale(locale.LC_ALL, 'tr_TR.utf8')
myCity = u'Isparta İsparta'
print(myCity.lower())
# isparta isparta
So if this bug affects you, as a workaround you could perform this translation yourself:
lower_map = {
ord(u'I'): u'ı',
ord(u'İ'): u'i',
}
myCity = u'Isparta İsparta'
lowerCity = myCity.translate(lower_map)
print(lowerCity)
# ısparta isparta
prints
ısparta isparta
You should use new derived class from unicode from emre's solution
class unicode_tr(unicode):
CHARMAP = {
"to_upper": {
u"ı": u"I",
u"i": u"İ",
},
"to_lower": {
u"I": u"ı",
u"İ": u"i",
}
}
def lower(self):
for key, value in self.CHARMAP.get("to_lower").items():
self = self.replace(key, value)
return self.lower()
def upper(self):
for key, value in self.CHARMAP.get("to_upper").items():
self = self.replace(key, value)
return self.upper()
if __name__ == '__main__':
print unicode_tr("kitap").upper()
print unicode_tr("KİTAP").lower()
Gives
KİTAP
kitap
This must solve your problem.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With