Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert Unicode to double ASCII letters in Python (ß -> ss)

Some Unicode characters can also be written as two ASCII letters (e.g.: ß -> ss, å -> aa). Is there any way to convert these in Python, without having a list with all of them?

LATER EDIT:

This kind of conversion is done by a lof of websites, including Stackoverflow (url from this page was converted), and Twitter. I'm curious how they do it.

like image 863
Mihnea Giurgea Avatar asked Jun 28 '12 15:06

Mihnea Giurgea


People also ask

How do I convert Unicode to letter in Python?

In Python, the built-in functions chr() and ord() are used to convert between Unicode code points and characters. A character can also be represented by writing a hexadecimal Unicode code point with \x , \u , or \U in a string literal.

Is Python Unicode or Ascii?

Web content can be written in any of these languages and can also include a variety of emoji symbols. Python's string type uses the Unicode Standard for representing characters, which lets Python programs work with all these different possible characters.

How do you make a string containing Unicode characters in Python?

You have two options to create Unicode string in Python. Either use decode() , or create a new Unicode string with UTF-8 encoding by unicode(). The unicode() method is unicode(string[, encoding, errors]) , its arguments should be 8-bit strings.

How do you print the Unicode of a character in Python?

Use the "\u" escape sequence to print Unicode characters In a string, place "\u" before four hexadecimal digits that represent a Unicode code point. Use print() to print the string.


1 Answers

There are no universal rules.

You could try unidecode module to transliterate Unicode text to ASCII.

like image 104
jfs Avatar answered Sep 17 '22 00:09

jfs