
How can I change extended Latin characters to their unaccented ASCII equivalents?

Tags: regex, perl

I need a generic transliteration or substitution regex that maps extended Latin characters to similar-looking ASCII characters, and all other extended characters to '' (the empty string), so that:

  • é becomes e

  • ê becomes e

  • á becomes a

  • ç becomes c

  • Ď becomes D

and so on, but things like ‡ or Ω or ‰ just get stripped away.

rwired asked Jan 16 '09 10:01




1 Answer

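Use Unicode::Normalize to get NFD($str), the canonical decomposition of the string. In this form, precomposed accented characters such as é are turned into a base character followed by one or more combining diacritic characters. Then simply remove all the non-ASCII characters, which drops the combining marks along with symbols like ‡ that have no ASCII base.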
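A minimal sketch of that approach, assuming the input is already a decoded Perl character string (not raw bytes). The helper name to_ascii is made up for illustration; NFD is the function exported by Unicode::Normalize.

```perl
use strict;
use warnings;
use utf8;                            # the literals below are UTF-8 source text
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decompose, then drop everything outside ASCII.
sub to_ascii {
    my ($str) = @_;
    my $decomposed = NFD($str);          # e.g. "é" becomes "e" followed by U+0301
    $decomposed =~ s/[^\x00-\x7F]//g;    # strips combining marks and other non-ASCII
    return $decomposed;
}

binmode(STDOUT, ':encoding(UTF-8)');
print to_ascii("éêáçĎ‡Ω‰"), "\n";        # prints "eeacD"
```

Note that letters with no canonical decomposition, such as ø, ß or Ł, are removed entirely by this sketch instead of being mapped to o, ss or L. If those matter, an explicit tr/// or lookup table for them would have to run before the strip.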

bobince answered Nov 15 '22 08:11