
How can I transliterate a word from another language into English letters?

I need to find a way to transliterate words from other languages into English letters. For example, привет (Russian) sounds like privet when written in English letters.

Meaning and grammar don't matter, but the result should sound as close to the original as possible. Everything should be in Python. I have searched the internet diligently and haven't found a good approach.

For example, something similar to this:

translit("юу со беутифул", "ru") = juu so beutiful

translit("кар", "ru") = kar
Alex asked Jan 04 '23 11:01


2 Answers

Maybe you should give unidecode a try:

>>> import unidecode
>>> unidecode.unidecode("юу со беутифул")
'iuu so beutiful'
>>> unidecode.unidecode("die größten Probleme")
'die grossten Probleme'
>>> unidecode.unidecode("Avec Éloïse, ils président à l'assemblée")
"Avec Eloise, ils president a l'assemblee"

Install it with pip:

pip3 install unidecode
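
If installing a package is not an option, a rough standard-library sketch can cover the accented-Latin cases shown above. This is a different technique from what unidecode does: it only decomposes characters and drops the combining marks, so it leaves Cyrillic text and letters such as ß unchanged. The helper name `strip_accents` is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics from Latin-script text (hypothetical helper)."""
    # NFKD splits "é" into "e" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Avec Éloïse, ils président à l'assemblée"))
# Avec Eloise, ils president a l'assemblee
```

For anything beyond accented Latin letters, a real transliteration table is still needed.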
lenz answered Jan 06 '23 23:01


You may already be using it, but you can use the transliterate package.

Basic install with pip:

pip install transliterate

Then the code:

# coding: utf-8

from transliterate import translit

print(translit(u"юу со беутифул", 'ru', reversed=True))  # juu so beutiful
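
To illustrate the general idea of a character table, here is a minimal standard-library sketch. The names `RU_TO_LATIN`, `TABLE` and `simple_translit` are hypothetical, and the mapping is a partial, hand-picked one, not the table the transliterate package uses.

```python
# Hypothetical, hand-picked Russian-to-Latin table (an assumption for
# illustration; real romanization schemes differ in several letters).
RU_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "jo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "h", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "sch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "ju", "я": "ja",
}

# Build a table for str.translate, adding capitalized variants.
TABLE = {}
for cyr, lat in RU_TO_LATIN.items():
    TABLE[ord(cyr)] = lat
    TABLE[ord(cyr.upper())] = lat.capitalize()


def simple_translit(text):
    """Replace each known Cyrillic letter; leave everything else as is."""
    return text.translate(TABLE)


print(simple_translit("юу со беутифул"))  # juu so beutiful
print(simple_translit("кар"))             # kar
```

A dedicated package is still preferable for real use, since it handles more languages and context-dependent letter combinations.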

WITH CUSTOM CLASS

As @Schmuddi proposed, you can create your own custom language pack to handle German special characters (this works only with Python 3.x, though).

pip3 install transliterate

Then the code:

# coding: utf-8

from transliterate import translit
from transliterate.base import TranslitLanguagePack, registry

class GermanLanguagePack(TranslitLanguagePack):
    language_code = "de"
    language_name = "Deutsch"

    pre_processor_mapping = {
        u"ß": u"ss",
    }

    mapping = (
        u"ÄÖÜäöü",
        u"AOUaou",
    )

registry.register(GermanLanguagePack)

print(translit(u"Die größten Katzenrassen der Welt", "de")) 
#Die grossten Katzenrassen der Welt

Bonus, the French one:

class FrenchLanguagePack(TranslitLanguagePack):
    language_code = "fr"
    language_name = "French"

    pre_processor_mapping = {
        u"œ": u"oe",
        u"Œ": u"OE",
        u"æ": u"ae",
        u"Æ": u"AE",
    }


    mapping = (
        u"àâçéèêëïîôùûüÿÀÂÇÉÈÊËÏÎÔÙÛÜŸ",
        u"aaceeeeiiouuuyAACEEEEIIOUUUY"
    )


registry.register(FrenchLanguagePack)

print(translit(u"Avec Éloïse, ils président à l'assemblée", 'fr'))
#Avec Eloise, ils president a l'assemblee
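
The two-part structure of these language packs (multi-character replacements plus a one-to-one character mapping) can also be sketched without any third-party package. This is only an illustration in the same spirit, not how transliterate is implemented, and `make_transliterator` is a hypothetical helper name.

```python
def make_transliterator(multi_char, source_chars, target_chars):
    """Return a function applying both kinds of replacement (hypothetical helper)."""
    # One-to-one table: source_chars[i] is replaced by target_chars[i].
    table = str.maketrans(source_chars, target_chars)

    def transliterate(text):
        # First handle characters that expand to several letters.
        for src, dst in multi_char.items():
            text = text.replace(src, dst)
        # Then apply the single-character mapping.
        return text.translate(table)

    return transliterate


german = make_transliterator({"ß": "ss"}, "ÄÖÜäöü", "AOUaou")
print(german("Die größten Katzenrassen der Welt"))
# Die grossten Katzenrassen der Welt
```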

OTHER POSSIBLE SOLUTION

Since transliterate doesn't cover the German language (yet?), you can use another package, py-translate, to translate sentences directly. Note that this translates rather than transliterates, and it uses Google Translate, so you need an internet connection.

Basic install with pip:

pip install py-translate

Then your code:

# coding: utf-8

from translate import translator

print(translator('ru', 'en', u"юу со беутифул"))
print(translator('de', 'en', u"Die größten Katzenrassen der Welt"))
Kruupös answered Jan 07 '23 01:01