I have a Unicode string in Python, and I would like to remove all the accents (diacritics).
I found on the web an elegant way to do this (in Java):
Do I need to install a library such as pyICU or is this possible with just the Python standard library? And what about python 3?
Important note: I would like to avoid code with an explicit mapping from accented characters to their non-accented counterpart.
Unidecode is the correct answer for this. It transliterates any unicode string into the closest possible representation in ascii text.
Example:
>>> from unidecode import unidecode
>>> unidecode('kožušček')
'kozuscek'
>>> unidecode('北亰')
'Bei Jing '
>>> unidecode('François')
'Francois'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With