
How to get the current locale's alphabet in Python 3?

In Python 2 you could do the following to get the current locale's character set:

import string
print string.letters

However, in Python 3 the string module's locale-dependent constants (string.letters, string.lowercase, string.uppercase, etc.) were removed.
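For reference, a quick check of what the string module still offers, assuming a standard CPython 3 interpreter. Only the locale-independent ASCII constants remain:

```python
import string

# Locale-independent ASCII constants that are still available in Python 3.
ascii_alphabet = string.ascii_lowercase + string.ascii_uppercase
print(ascii_alphabet)

# Collect the Python 2 locale-dependent names that no longer exist.
removed = [name for name in ("letters", "lowercase", "uppercase")
           if not hasattr(string, name)]
print(removed)
```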


How can I get the current locale's character set using Python 3?

jinscoe123 asked Aug 27 '18 19:08




1 Answer

You can get the exemplar characters for each locale using the pyicu module:

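import locale
from icu import LocaleData  # provided by the PyICU package

# Default locale name (e.g. 'pt_BR') and encoding, taken from the environment.
default, encoding = locale.getdefaultlocale()
languages = [default] + ['en_US', 'fr_FR', 'es_ES']

for language in languages:
    data = LocaleData(language)
    # The exemplar set holds the letters commonly used to write the language.
    alphabet = data.getExemplarSet()
    print(language, alphabet)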

Output

pt_BR [a-zà-ãçéêíò-õú]
en_US [a-z]
fr_FR [a-zàâæ-ëîïôùûüÿœ]
es_ES [a-záéíñóúü]
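Each printed value is a set pattern in which x-y denotes a range of code points. As a rough illustration of how to read it, here is a minimal pure-Python sketch that expands such a pattern into individual characters. The helper name expand_simple_pattern is hypothetical, and it only handles literal characters and simple ranges as they appear in the output above, not escapes, nested sets, or multi-character strings.

```python
def expand_simple_pattern(pattern):
    """Expand a simple set pattern such as '[a-zà-ã]' into a list of characters.

    Hypothetical helper: handles only literal characters and 'x-y' ranges,
    not escapes, nested sets, or multi-character strings in braces.
    """
    body = pattern.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    chars = []
    i = 0
    while i < len(body):
        # A range looks like 'a-z': start character, hyphen, end character.
        if i + 2 < len(body) and body[i + 1] == "-":
            start, end = ord(body[i]), ord(body[i + 2])
            chars.extend(chr(cp) for cp in range(start, end + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


pt_br = expand_simple_pattern("[a-zà-ãçéêíò-õú]")
print(len(pt_br), "".join(pt_br))
```

Note that exemplar sets are lowercase; uppercase forms, if needed, would have to be derived separately (for example with str.upper()).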

To get the alphabet for the current locale only, it is enough to do:

default, _ = locale.getdefaultlocale()
data = LocaleData(default)
alphabet = data.getExemplarSet()
print(default, alphabet)
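One caveat: locale.getdefaultlocale() has been deprecated since Python 3.11. A possible alternative, sketched here with the standard library only, is to adopt the environment's locale with setlocale() and then read it back with getlocale(). The helper name current_locale_name and the 'en_US' fallback are assumptions made for illustration, and the returned name may look different on Windows.

```python
import locale


def current_locale_name(fallback="en_US"):
    """Return a locale name such as 'pt_BR', or the fallback if none is found.

    Hypothetical helper; the 'en_US' fallback is an arbitrary choice.
    """
    try:
        # Adopt the user's environment settings for character classification.
        locale.setlocale(locale.LC_CTYPE, "")
        name, _encoding = locale.getlocale(locale.LC_CTYPE)
    except (locale.Error, ValueError):
        name = None
    # Treat the unset and minimal C/POSIX locales as "no locale configured".
    if name in (None, "C", "POSIX"):
        return fallback
    return name


print(current_locale_name())
```

The resulting string could then be passed to LocaleData in place of the value from getdefaultlocale().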
Dani Mesejo answered Oct 05 '22 02:10
