Convert fancy/artistic unicode text to ASCII

Question

I have a unicode string like "𝖙𝖍𝖚𝖌 𝖑𝖎𝖋𝖊" and would like to convert it to the ASCII form "thug life".

I know I can achieve this in Python by

import unidecode
print(unidecode.unidecode('𝖙𝖍𝖚𝖌 𝖑𝖎𝖋𝖊'))
// thug life

However, this would asciify also other unicode characters (such as Chinese/Japanese characters, emojis, accented characters, etc.), which I want to preserve.

Is there a way to detect these type of "artistic" unicode characters?

Some more examples:

𝓽𝓱𝓾𝓰 𝓵𝓲𝓯𝓮

𝓉𝒽𝓊𝑔 𝓁𝒾𝒻𝑒

𝕥𝕙𝕦𝕘 𝕝𝕚𝕗𝕖

ｔｈｕｇｌｉｆｅ

Thanks for your help!

JosefZ · Accepted Answer

import unicodedata
strings = [
  '𝖙𝖍𝖚𝖌 𝖑𝖎𝖋𝖊',
  '𝓽𝓱𝓾𝓰 𝓵𝓲𝓯𝓮',
  '𝓉𝒽𝓊𝑔 𝓁𝒾𝒻𝑒',
  '𝕥𝕙𝕦𝕘 𝕝𝕚𝕗𝕖',
  'ｔｈｕｇ ｌｉｆｅ']
for x in strings:
  print(unicodedata.normalize( 'NFKC', x), x)

Output: .\62803325.py

thug life 𝖙𝖍𝖚𝖌 𝖑𝖎𝖋𝖊
thug life 𝓽𝓱𝓾𝓰 𝓵𝓲𝓯𝓮
thug life 𝓉𝒽𝓊𝑔 𝓁𝒾𝒻𝑒
thug life 𝕥𝕙𝕦𝕘 𝕝𝕚𝕗𝕖
thug life ｔｈｕｇ ｌｉｆｅ

Resources:

unicodedata — Unicode Database
Normalization forms for Unicode text

Convert fancy/artistic unicode text to ASCII

Tags:

python

python-3.x

unicode

ascii

Martin

1 Answers

JosefZ

Recent Activity

Donate For Us

Convert fancy/artistic unicode text to ASCII

Tags:

python

python-3.x

unicode

ascii

Martin

1 Answers

JosefZ

Related questions

Recent Activity

Donate For Us