Python string.title() issue with German umlauts

I'm seeing strange behavior from Python's string.title() function when the string contains German umlauts (üöä). Not only is the first character of the string capitalized, but so is the character following the umlaut.

# -*- coding: utf-8 -*-
a = "müller"
print a.title()
# this prints >MüLler<, not >Müller< as expected

I tried to fix this by setting the locale to German with the UTF-8 charset, but without success:

import locale
locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8')
a="müller"
print a.title()
# same value >MüLler<

Any ideas on how to prevent the capitalization after the umlaut?
My Python version is 2.6.6 on Debian Linux.

cazzler asked Dec 20 '25 20:12


1 Answer

Decode your string to Unicode, then use unicode.title():

>>> a = "müller"
>>> a.decode('utf8').title()
u'M\xfcller'
>>> print a.decode('utf8').title()
Müller

You can always encode to UTF-8 again later on.
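
As a rough sketch of the full round trip: calling title() on the raw UTF-8 bytes goes wrong because "ü" is stored as two bytes (0xC3 0xBC), neither of which counts as a letter, so the "l" after them is treated as the start of a new word. Decoding first lets title() see "ü" as a single cased character. The helper name title_utf8 is made up for illustration and is not part of the answer above; the snippet assumes the input really is UTF-8.

```python
# -*- coding: utf-8 -*-
# Sketch only: title_utf8 is a hypothetical helper, and the input is
# assumed to be valid UTF-8.

def title_utf8(raw):
    """Title-case a UTF-8 encoded byte string and return UTF-8 bytes."""
    text = raw.decode('utf-8')            # bytes -> Unicode text
    return text.title().encode('utf-8')   # title-case, then back to bytes

# The UTF-8 bytes for "müller", written with an escape so the example
# does not depend on the source file or terminal encoding.
raw = u'm\xfcller'.encode('utf-8')

# title() on the bytes sees the two bytes of "ü" as non-letters,
# so the following "l" is capitalized as if it began a new word.
broken = raw.title()

# Going through Unicode keeps "ü" as one cased character.
fixed = title_utf8(raw)
```

Here broken holds the bytes for "MüLler" and fixed holds the bytes for "Müller". Writing the literal as u"müller" in the source (with the coding line in place) avoids the decode step altogether.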

Martijn Pieters answered Dec 23 '25 10:12


