Is there a library that can replace special characters with their ASCII equivalents, for example:
"Cześć"
to:
"Czesc"
I can of course create a map:
{'ś':'s', 'ć': 'c'}
and use a replace function, but I don't want to hardcode all the equivalents into my program if there is a function that already does that.
Special characters (ASCII 32–47, 58–64, 91–96, 123–126): all printable characters that are neither letters nor digits, such as punctuation and technical or mathematical symbols.
If you have the ASCII code for a digit, you can either subtract 30h or mask off the upper four bits, and you will be left with the digit's numeric value. Likewise, you can generate the ASCII code from the value by adding 30h or by ORing it with 30h.
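The 30h arithmetic described above can be sketched in Python as follows. The helper names are hypothetical, chosen only for illustration, and the sketch assumes its input is a single decimal digit ('0' to '9' or 0 to 9); it does no validation.

```python
# Hypothetical helpers illustrating the 0x30 offset between a decimal
# digit character and its numeric value. Valid only for single digits.

def digit_value_by_subtraction(ch):
    # '0' is 0x30, so subtracting 0x30 leaves the digit's value.
    return ord(ch) - 0x30

def digit_value_by_mask(ch):
    # Masking off the upper four bits keeps only the low nibble.
    return ord(ch) & 0x0F

def digit_char(n):
    # ORing with 0x30 (same as adding 0x30 for 0..9) gives the ASCII code.
    return chr(n | 0x30)

print(digit_value_by_subtraction('7'))  # 7
print(digit_value_by_mask('7'))         # 7
print(digit_char(7))                    # 7
```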
Using ASCII values: uppercase letters have ASCII values 65 to 90, and lowercase letters have ASCII values 97 to 122.
You must first convert the character to its ASCII value. In LiveCode, this is done with the charToNum function. Converting a number back to the corresponding character is done with the numToChar function.
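#!/usr/bin/env python3
import unicodedata

text = 'Cześć'
# NFD splits each accented letter into a base letter plus combining marks;
# encoding to ASCII with 'ignore' then drops the marks.
print(unicodedata.normalize('NFD', text).encode('ascii', 'ignore').decode('ascii'))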
You can get most of the way by doing:
import unicodedata

def strip_accents(text):
    # Decompose with NFKD, then drop nonspacing combining marks (category 'Mn').
    return ''.join(c for c in unicodedata.normalize('NFKD', text)
                   if unicodedata.category(c) != 'Mn')
Unfortunately, there exist accented Latin letters that cannot be decomposed into an ASCII letter + combining marks. You'll have to handle them manually. These include:
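For illustration, letters such as ł, ø, đ, æ and ß fall into this category (these are examples picked here, not necessarily the answer's original list). A minimal sketch of the manual handling might look like the following. It assumes a small hypothetical `FALLBACK` table and a hypothetical helper named `to_ascii`; the table is deliberately incomplete and the chosen replacements are one reasonable convention, not a standard.

```python
import unicodedata

# Hypothetical, incomplete fallback table for letters that Unicode
# normalization does not decompose into a base letter plus marks.
FALLBACK = {
    'Ł': 'L', 'ł': 'l',
    'Ø': 'O', 'ø': 'o',
    'Đ': 'D', 'đ': 'd',
    'Æ': 'AE', 'æ': 'ae',
    'ß': 'ss',
}

def to_ascii(text):
    # First substitute the letters that normalization cannot handle.
    replaced = ''.join(FALLBACK.get(c, c) for c in text)
    # Then decompose and drop nonspacing combining marks (category 'Mn').
    decomposed = unicodedata.normalize('NFKD', replaced)
    return ''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')

print(to_ascii('Cześć'))   # Czesc
print(to_ascii('Łódź'))    # Lodz
print(to_ascii('Straße'))  # Strasse
```

Characters that are neither in the table nor decomposable pass through unchanged, so the result is not guaranteed to be pure ASCII unless the table covers every such character in your input.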