The most common regex suggested for removing special characters seems to be this:
preg_replace( '/[^a-zA-Z0-9]/', '', $string );
The problem is that it also removes non-English characters.
Is there a regex that removes special characters in all languages? Or is the only solution to explicitly match each special character and remove it?
If you want to use any of these as literal characters you can escape special characters with \ to give them their literal character meaning.
In JavaScript, use the replace() method to remove all special characters from a string, e.g. str.replace(/[^a-zA-Z0-9 ]/g, ''). The replace method returns a new string that doesn't contain any special characters.
To match a character that has special meaning in regex, you need to use an escape sequence prefixed with a backslash ( \ ). E.g., \. matches "." ; regex \+ matches "+" ; and regex \( matches "(" . You also need to use regex \\ to match "\" (backslash).
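In PHP, the escaping described above can be done by hand or delegated to preg_quote(). The following is a minimal sketch; the variable names and sample strings are made up for illustration.

```php
<?php
// Hypothetical input: text that should be matched literally, not as a pattern.
$literal = '1.5+2(x)';

// preg_quote() puts a backslash in front of every regex metacharacter.
// The second argument also escapes the delimiter used around the pattern.
$escaped = preg_quote($literal, '/');

// Illustrative subject string.
$subject = 'f = 1.5+2(x) and 115+2(x)';

// Only the literal occurrence is removed. "115+2(x)" stays, because the
// escaped "." no longer means "any character".
$result = preg_replace('/' . $escaped . '/', '', $subject);
```

Escaping by hand works for a fixed pattern, but preg_quote() is usually the safer choice when the literal text comes from user input.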
Similarly, in Java, if your String contains many special characters, you can remove all of them by keeping only the characters you want, e.g. replaceAll("[^a-zA-Z0-9_-]", ""), which replaces everything except a to z, A to Z, 0 to 9, _ and dash with an empty String.
You can use this instead:
preg_replace('/\P{Xan}+/u', '', $string );
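\p{Xan} matches any character that is a letter or a number in any script of the Unicode table.
\P{Xan} matches any character that is not a letter or a number. It is a shortcut for [^\p{Xan}].
The u modifier is required so that the pattern and the subject are treated as UTF-8.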
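To make the difference concrete, here is a small sketch that runs the ASCII-only pattern from the question and the Xan pattern on the same text. The sample string is invented for illustration, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Sample text chosen for illustration; the file is assumed to be UTF-8.
$string = 'Héllo, wörld! Привет, 123 日本語?';

// ASCII-only class from the question: accented, Cyrillic and CJK
// characters are stripped along with the punctuation.
$asciiOnly = preg_replace('/[^a-zA-Z0-9]/', '', $string);

// PCRE's Xan property with the u modifier: letters and digits of any
// script survive, everything else is removed.
$unicodeAware = preg_replace('/\P{Xan}+/u', '', $string);
```

Note that Xan is a PCRE-specific property, so this pattern is not expected to be portable to regex engines that are not based on PCRE.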
You can use:
$string = preg_replace( '/[^\p{L}\p{N}]+/u', '', $string );
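One caveat with [^\p{L}\p{N}]: it also removes whitespace and combining marks (\p{M}), so a decomposed "é" (an "e" followed by U+0301) loses its accent. If that is not what you want, a variant along these lines may help. This is only a sketch: the function name strip_special_chars is hypothetical, and which classes to keep depends on your data.

```php
<?php
// Hypothetical helper (name chosen for illustration).
function strip_special_chars(string $text): string
{
    // Keep letters, combining marks, numbers and whitespace; drop the rest.
    $clean = preg_replace('/[^\p{L}\p{M}\p{N}\s]+/u', '', $text);

    // With the u modifier, preg_replace() returns null when the subject
    // is not valid UTF-8, so check before using the result.
    if ($clean === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $clean;
}

$withSpaces = strip_special_chars("Crème brûlée, s'il vous plaît!");

// "e" followed by U+0301 (combining acute accent), i.e. a decomposed "é".
$decomposed = strip_special_chars("e\u{0301}!");
```

The "\u{0301}" escape requires PHP 7 or later. Checking for null matters because silently assigning the result back to $string would otherwise replace malformed input with null.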