I have a problem with converting uppercase letters with umlauts to lowercase ones.
print("ÄÖÜAOU".lower())
The A, O and U get converted properly, but the Ä, Ö and Ü stay uppercase. Any ideas?
The first problem is fixed with .decode('utf-8'), but I still have a second one:
# -*- coding: utf-8 -*-
original_message="ÄÜ".decode('utf-8')
original_message=original_message.lower()
original_message=original_message.replace("ä", "x")
print(original_message)
Traceback (most recent call last):
  File "Untitled.py", line 4, in <module>
    original_message=original_message.replace("ä", "x")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)
lower() is a built-in Python string method. It takes no arguments and returns a lowercased version of the given string, converting each uppercase character to lowercase. Symbols and numbers are left as they are, and a string with no uppercase characters comes back unchanged.
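As a quick illustration of that description, here is a minimal sketch. The sample strings are made up for this example, and unicode literals are used so it behaves the same on Python 2.7 and Python 3:

```python
# -*- coding: utf-8 -*-
# Sample values are illustrative only.
mixed = u"Hello World 123 !?"
already_lower = u"nothing to change"

# Digits, spaces and punctuation pass through untouched.
lowered = mixed.lower()

# No uppercase characters, so the result equals the input.
unchanged = already_lower.lower()

print(lowered)
print(unchanged == already_lower)
```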
You'll need to mark it as a unicode string unless you're working with plain ASCII:
> print(u"ÄÖÜAOU".lower())
äöüaou
It works the same with variables; it all depends on the type assigned to the variable to begin with.
> olle = "ÅÄÖABC"
> print(olle.lower())
ÅÄÖabc
> olle = u"ÅÄÖABC"
> print(olle.lower())
åäöabc
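The difference is easier to see when the bytes are written out explicitly. The sketch below assumes the text is UTF-8 encoded (which the .decode('utf-8') in the question suggests) and uses escape sequences so it runs unchanged on Python 2.7 and Python 3. The variable names are just for illustration.

```python
# -*- coding: utf-8 -*-
# "ÄÖÜAOU" as UTF-8 encoded bytes (assumption: the input is UTF-8).
raw = b"\xc3\x84\xc3\x96\xc3\x9cAOU"

# Lowercasing the byte string only touches the ASCII letters A, O, U;
# the two-byte sequences for the umlauts are left as they are.
lowered_bytes = raw.lower()

# Decoding first yields real text, so the umlauts are lowercased as well.
lowered_text = raw.decode("utf-8").lower()

print(repr(lowered_bytes))
print(lowered_text == u"\xe4\xf6\xfcaou")  # u"\xe4\xf6\xfc" is "äöü"
```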
You are dealing with encoded strings, not with unicode text. The .lower() method of byte strings can only deal with ASCII values. Decode your string to Unicode or use a unicode literal (u''), then lowercase:
>>> print u"\xc4AOU".lower()
äaou
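The UnicodeDecodeError from the follow-up in the question has the same root cause: on Python 2, replace() is called on a unicode object but is handed the byte-string literal "ä", which Python then tries to decode implicitly as ASCII. A minimal sketch of one way around it, assuming a UTF-8 source file, is to make the replace() arguments unicode literals as well. On Python 3, where every str is already unicode, the u prefix is accepted but unnecessary.

```python
# -*- coding: utf-8 -*-
# Start from unicode text rather than an encoded byte string.
original_message = u"ÄÜ"
original_message = original_message.lower()

# Both arguments are unicode literals, so no implicit ASCII decoding happens.
original_message = original_message.replace(u"ä", u"x")

print(original_message)
```

Decoding with .decode('utf-8') as in the question works as a starting point too; the important part is that every string taking part in the operation is unicode.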