
Replace language-specific characters in Python with English letters

Is there any way in Python 3 to replace language-specific characters with plain English letters?
For example, I have a function get_city(IP) that returns the name of the city associated with a given IP. It reads from an external database, so I can't change how the value is encoded; I just receive whatever the database returns.
I would like to do something like:

city = "České Budějovice"
city = clear_name(city)
print(city)  # should print "Ceske Budejovice"

I used Czech here, but in general it should work for any non-Asian language.

asked Dec 04 '22 23:12 by Photon Light


1 Answer

Try unidecode, a third-party package (pip install Unidecode):

# coding=utf-8
from unidecode import unidecode

city = "České Budějovice"
print(unidecode(city))

Prints Ceske Budejovice as desired (assuming your post has a typo).

Note: if you're using Python 2.x, you'll need to decode the string before passing it to unidecode, e.g. unidecode(city.decode('utf-8'))
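
If installing a third-party package is not an option, the standard library's unicodedata module can get part of the way there. The following is only a sketch, not what unidecode does internally: it decomposes each character and drops the combining marks. The function is named clear_name after the placeholder in the question. It assumes accented Latin letters are all you need to handle; letters with no Unicode decomposition (such as "ł" or "ø") pass through unchanged, whereas unidecode transliterates those as well.

```python
import unicodedata


def clear_name(text: str) -> str:
    """Strip diacritics using only the standard library (illustrative sketch)."""
    # NFKD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep every other character unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


city = "České Budějovice"
print(clear_name(city))  # Ceske Budejovice
```

For mixed or unknown input languages, unidecode is probably the safer choice, since it covers far more characters than this decomposition trick.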

answered Dec 09 '22 16:12 by asongtoruin