I need a way to transliterate words from other languages into the Latin alphabet, so that they read roughly the same in English. For example, привет (in Russian) sounds like privet (in English).
Meaning and grammar don't matter; I only want the result to sound as close to the original as possible. Everything should be in Python. I have searched the internet thoroughly and haven't found a good approach.
For example, something similar to this:
translit("юу со беутифул", "ru") = juu so beutiful
translit("кар", "ru") = kar
Maybe you should give unidecode a try:
>>> import unidecode
>>> unidecode.unidecode("юу со беутифул")
'iuu so beutiful'
>>> unidecode.unidecode("die größten Probleme")
'die grossten Probleme'
>>> unidecode.unidecode("Avec Éloïse, ils président à l'assemblée")
"Avec Eloise, ils president a l'assemblee"
Install it with pip:
pip3 install unidecode
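If installing a package is not an option, part of this can be approximated with the standard library alone. The sketch below is not what unidecode does internally; it only decomposes characters with `unicodedata.normalize` and drops the combining marks. The helper name `strip_accents` is made up for this example. It handles Latin diacritics only: Cyrillic text stays Cyrillic, and `ß` is left untouched.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics by decomposing (NFKD) and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Avec Éloïse, ils président à l'assemblée"))
# Avec Eloise, ils president a l'assemblee
```

For anything beyond accent stripping (Cyrillic, `ß`, ligatures), a real transliteration table such as the one in unidecode is still needed.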
Maybe you are already using it, but you can use the transliterate package.
Basic install with pip:
pip install transliterate
Then the code
# coding: utf-8
from transliterate import translit
print(translit(u"юу со беутифул", 'ru', reversed=True))  # juu so beutiful
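To get a feel for what such a transliteration amounts to, here is a dependency-free sketch built on `str.maketrans`. The table `RU_TO_LATIN` and the function `ru_to_latin` are hand-written for this example and are an assumption, not the table shipped with transliterate, so results can differ from the package for some letters.

```python
# Hypothetical, hand-written table; NOT the one used by the transliterate package.
RU_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "jo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "ch", "ш": "sh", "щ": "sch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "ju", "я": "ja",
}

# Derive the uppercase letters from the lowercase ones ("Ю" -> "Ju").
RU_TO_LATIN.update(
    {src.upper(): dst.capitalize() for src, dst in list(RU_TO_LATIN.items())}
)

# str.maketrans accepts a dict of single characters mapped to strings.
RU_TABLE = str.maketrans(RU_TO_LATIN)


def ru_to_latin(text):
    """Replace each Cyrillic letter; characters not in the table pass through."""
    return text.translate(RU_TABLE)


print(ru_to_latin("юу со беутифул"))  # juu so beutiful
print(ru_to_latin("кар"))             # kar
```

A plain per-character table like this cannot express context-dependent rules, which is one reason to prefer a maintained package when one exists for your language.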
WITH CUSTOM CLASS
As @Schmuddi proposed, you can create your own custom class to handle German special characters (this works only with Python 3.x, though).
pip3 install transliterate
Then the code:
# coding: utf-8
from transliterate import translit
from transliterate.base import TranslitLanguagePack, registry

class GermanLanguagePack(TranslitLanguagePack):
    language_code = "de"
    language_name = "Deutsch"
    pre_processor_mapping = {
        u"ß": u"ss",
    }
    mapping = (
        u"ÄÖÜäöü",
        u"AOUaou",
    )

registry.register(GermanLanguagePack)
print(translit(u"Die größten Katzenrassen der Welt", "de"))
# Die grossten Katzenrassen der Welt
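The two attributes of the pack can be read as two kinds of replacement: `pre_processor_mapping` for replacements that expand to several characters, and `mapping` for one-to-one substitutions. As a rough standard-library analogy (an assumption for illustration, not a description of how transliterate is implemented), the same result can be sketched like this. The names `MULTI_CHAR`, `SINGLE_CHAR` and `de_to_ascii` are made up here.

```python
# Hypothetical stand-ins for the pack's two attributes.
MULTI_CHAR = {"ß": "ss"}                           # one character -> several
SINGLE_CHAR = str.maketrans("ÄÖÜäöü", "AOUaou")    # one character -> one


def de_to_ascii(text):
    """Apply multi-character replacements first, then the 1:1 table."""
    for source, target in MULTI_CHAR.items():
        text = text.replace(source, target)
    return text.translate(SINGLE_CHAR)


print(de_to_ascii("Die größten Katzenrassen der Welt"))
# Die grossten Katzenrassen der Welt
```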
Bonus, the French one:
class FrenchLanguagePack(TranslitLanguagePack):
    language_code = "fr"
    language_name = "French"
    pre_processor_mapping = {
        u"œ": u"oe",
        u"Œ": u"OE",  # uppercase ligature maps to uppercase, like Æ below
        u"æ": u"ae",
        u"Æ": u"AE",
    }
    mapping = (
        u"àâçéèêëïîôùûüÿÀÂÇÉÈÊËÏÎÔÙÛÜŸ",
        u"aaceeeeiiouuuyAACEEEEIIOUUUY",
    )

registry.register(FrenchLanguagePack)
print(translit(u"Avec Éloïse, ils président à l'assemblée", 'fr'))
# Avec Eloise, ils president a l'assemblee
OTHER POSSIBLE SOLUTION
Since transliterate doesn't cover the German language (yet?), you can use another package to translate sentences directly: py-translate. It uses Google Translate, though, so you need an internet connection.
Basic install with pip:
pip install py-translate
Then your code:
# coding: utf-8
from translate import translator
print(translator('ru', 'en', u"юу со беутифул"))
print(translator('de', 'en', u"Die größten Katzenrassen der Welt"))