
Extended ASCII Character Conversion

How can I convert extended ASCII characters such as "æ", "ö" or "ç" into plain ASCII characters (a, o, c) using Python? For example, given "A", "Æ" or "Ä" as input, it should return "A" for each of them.

madprogramer asked Mar 28 '13 23:03


People also ask

How do I get extended ASCII characters?

On Windows, with a standard 101-key keyboard, extended characters such as é or ß can be typed by holding the Alt key and typing the corresponding four-digit character code, including the leading zero, on the numeric keypad. For example, é is typed by holding Alt and typing 0233 on the keypad.

How do I convert ASCII characters?

In Python, use the ord() function. It takes a string of length one and returns an integer representing the Unicode code point of that character. For example, ord('a') returns the integer 97.

Does UTF-8 support extended ASCII?

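Not directly. UTF-8 is a variable-length encoding of Unicode that uses one to four bytes per code point. The 128 ASCII characters, printable and non-printable, are encoded as single bytes identical to ASCII. Characters in the 128 to 255 range of an extended ASCII set such as Latin-1 can all be represented, but each takes two bytes, so UTF-8 is not byte-compatible with 8-bit extended ASCII encodings.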
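The byte counts are easy to check from Python. The following is a minimal sketch; the helper name `byte_lengths` is made up for illustration and is not part of any library.

```python
def byte_lengths(text):
    """Return (character, UTF-8 byte count) pairs for each character in text."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]

# One ASCII letter, one Latin-1 letter, one character beyond Latin-1.
print(byte_lengths("a\u00e9\u20ac"))

# The same letter (U+00E9, e with acute) in two encodings.
print("\u00e9".encode("utf-8"))    # two bytes in UTF-8
print("\u00e9".encode("latin-1"))  # one byte in Latin-1
```

Only the plain ASCII letter keeps its single-byte form in both encodings.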

Is extended ASCII the same as UTF-8?

No. Extended ASCII encodings such as Latin-1 store the code points 128 to 255 decimal (80 to FF hexadecimal, 1000 0000 to 1111 1111 binary) as single bytes with the leftmost bit set to one. This collides with UTF-8, where a byte with the leftmost bit set is always part of a multi-byte sequence: either a lead byte announcing one to three further bytes, or a continuation byte. The two agree only on the 7-bit ASCII range.
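The collision shows up as soon as single-byte extended ASCII data is decoded as UTF-8. Here is a small sketch, assuming the legacy data is Latin-1 (ISO 8859-1); the function name `decode_guess` and the fallback policy are illustrative choices, not a general encoding detector.

```python
def decode_guess(raw):
    """Decode bytes as UTF-8, falling back to Latin-1 if that fails.

    Assumption: anything that is not valid UTF-8 is Latin-1.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # A lone byte such as 0xE9 is a UTF-8 lead byte with no
        # continuation bytes after it, so strict decoding rejects it.
        return raw.decode("latin-1")

print(decode_guess(b"caf\xe9"))      # Latin-1 bytes
print(decode_guess(b"caf\xc3\xa9"))  # UTF-8 bytes for the same word
```

Both calls return the same four-character string, but only because the fallback encoding was guessed correctly.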


1 Answer

Unidecode might be of use to you.

Python 3.2.3 (default, Jun  8 2012, 05:36:09) 
[GCC 4.7.0 20120507 (Red Hat 4.7.0-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from unidecode import unidecode
>>> unidecode("æ, ö or ç")
'ae, o or c'
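
Note that Unidecode turns "æ" into "ae", not "a". If a single base letter is wanted, as the question describes, a standard-library alternative is to decompose each character with unicodedata and drop the combining marks. This is only a sketch: the `EXTRA` table and the name `to_base_ascii` are assumptions made for illustration, and the table covers just a few letters that have no Unicode decomposition.

```python
import unicodedata

# Hypothetical fallback table: these letters do not decompose under NFKD,
# so the single-letter replacements are chosen by hand.
EXTRA = {
    "\u00c6": "A", "\u00e6": "a",  # AE ligature
    "\u00d8": "O", "\u00f8": "o",  # O with stroke
    "\u00df": "s",                 # sharp s
}

def to_base_ascii(text):
    """Replace accented letters with their base letter where possible."""
    out = []
    for ch in text:
        if ch in EXTRA:
            out.append(EXTRA[ch])
            continue
        # NFKD splits e.g. "o with diaeresis" into "o" plus a combining mark.
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)

print(to_base_ascii("A, \u00c6, \u00c4"))
print(to_base_ascii("\u00e6, \u00f6 or \u00e7"))
```

Characters with no decomposition and no entry in the table pass through unchanged, so the result is not guaranteed to be pure ASCII for arbitrary input.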
riamse answered Oct 13 '22 13:10