I want to remove all non-alphabetic characters from a string. The problem is that I don't know the letter range, because it is a UTF-8 string.
It can be ENGLISH, ՀԱՅԵՐԵՆ, ქართული, УКРАЇНСЬКИЙ, РУССКИЙ
I usually do something like this:
$str = preg_replace('/[^a-zA-Z]/', '', $str);
or
$str = preg_replace('/[^\w]/u', '', $str);
but they both clear foreign characters.
Any ideas?
Non-alphanumeric characters can be removed with the preg_replace() function, which performs a regular expression search and replace: it searches the subject string for matches of the pattern and replaces each match with the replacement.
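As a rough sketch of that idea for UTF-8 input, the pattern below keeps Unicode letters (\p{L}) and numbers (\p{N}) and strips everything else. The function name keepLettersAndDigits is made up for this example, and the input is assumed to be valid UTF-8:

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
function keepLettersAndDigits(string $str): string
{
    // [^\p{L}\p{N}] matches any code point that is neither a letter nor a number.
    // The u modifier makes PCRE treat the pattern and the subject as UTF-8.
    // preg_replace() returns null on error (e.g. malformed UTF-8), so fall back to ''.
    return preg_replace('/[^\p{L}\p{N}]+/u', '', $str) ?? '';
}

echo keepLettersAndDigits('Room 42, РУССКИЙ!'), "\n"; // prints "Room42РУССКИЙ"
```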
To automatically find and delete non-UTF-8 characters, we're going to use the iconv command. It is used in Linux systems to convert text from one character encoding to another.
To remove all non-alphanumeric characters from a string, call the replace() method, passing it a regular expression that matches all non-alphanumeric characters as the first parameter and an empty string as the second. The replace method returns a new string with all matches replaced.
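Use Unicode character properties. \P{L} matches any character that is not a letter in any script, and the u modifier makes the pattern and subject be treated as UTF-8: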
$str = preg_replace('/\P{L}+/u', '', $str);
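Here is a small, self-contained sketch of that one-liner in use. The wrapper name keepLetters is only for illustration, and it assumes the input is valid UTF-8 (preg_replace() returns null otherwise, which the sketch turns into an empty string):

```php
<?php
// Hypothetical wrapper for illustration; the name is not from the original post.
function keepLetters(string $str): string
{
    // \P{L} is the negation of \p{L}: any code point that is NOT a letter.
    // The u modifier switches PCRE to UTF-8 mode.
    $result = preg_replace('/\P{L}+/u', '', $str);

    // null signals an error such as malformed UTF-8 in the subject.
    return $result ?? '';
}

echo keepLetters('Hello, ՀԱՅԵՐԵՆ 123 ქართული!'), "\n"; // prints "HelloՀԱՅԵՐԵՆქართული"
```

Note that spaces are removed too, since they are not letters. If the text may contain decomposed accents (a base letter followed by a combining mark), a pattern such as '/[^\p{L}\p{M}]+/u' would keep the marks as well.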