
Regex for removing special characters on a multilingual string

The regex most commonly suggested for removing special characters seems to be this:

preg_replace( '/[^a-zA-Z0-9]/', '', $string );

The problem is that it also removes non-English characters.

Is there a regex that removes special characters in all languages? Or is the only solution to explicitly match each special character and remove it?

A.Jesin asked Apr 29 '14 18:04





2 Answers

You can use instead:

preg_replace('/\P{Xan}+/u', '', $string );

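\p{Xan} matches any character that is a letter or a number in any script of the Unicode table. It is a PCRE-specific property, equivalent to [\p{L}\p{N}].
\P{Xan} matches any character that is not a letter or a number. It is a shortcut for [^\p{Xan}].
The u modifier tells PCRE to treat both the pattern and the subject string as UTF-8.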
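
As a minimal sketch of the difference, the snippet below runs the ASCII-only pattern from the question and the \P{Xan} pattern on the same sample string. The helper names and the sample text are made up for illustration, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helpers for illustration; the names are not from the original answer.

// ASCII-only pattern from the question: drops every byte outside a-z, A-Z, 0-9,
// which includes the bytes of accented and non-Latin letters.
function strip_ascii_only(string $s): ?string
{
    return preg_replace('/[^a-zA-Z0-9]/', '', $s);
}

// Unicode-aware pattern: drops only characters that are neither letters nor numbers.
// Returns null if the subject is not valid UTF-8.
function strip_non_alnum(string $s): ?string
{
    return preg_replace('/\P{Xan}+/u', '', $s);
}

$sample = "Crème brûlée, 2 порции!";

echo strip_ascii_only($sample), "\n"; // Crmebrle2
echo strip_non_alnum($sample), "\n";  // Crèmebrûlée2порции
```

Note that both patterns also remove whitespace, since a space is neither a letter nor a number.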

Casimir et Hippolyte answered Sep 23 '22 14:09



You can use:

$string = preg_replace( '/[^\p{L}\p{N}]+/u', '', $string );
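
This character class matches the same characters as \P{Xan}, since Xan is PCRE's shorthand for "letter or number". One thing worth guarding against in either version: with the u modifier, preg_replace() returns null when the subject is not valid UTF-8. A hedged sketch follows, where remove_special_chars is a made-up wrapper name and throwing an exception is only one possible way to handle the failure:

```php
<?php
// Hypothetical wrapper for illustration; the name and the error handling are assumptions.
function remove_special_chars(string $s): string
{
    $result = preg_replace('/[^\p{L}\p{N}]+/u', '', $s);

    if ($result === null) {
        // With the u modifier, a subject that is not valid UTF-8 makes preg_replace() fail.
        throw new InvalidArgumentException(
            'preg_replace failed with error code ' . preg_last_error()
        );
    }

    return $result;
}

echo remove_special_chars("日本語 テキスト #1"), "\n"; // 日本語テキスト1
```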
anubhava answered Sep 22 '22 14:09
