
PHP PREG Regex: What does "\W" mean when using the UTF-8 modifier?

I know that in normal PHP regex (ASCII mode) "\w" (word) means "letter, number, and _", and "\W" is its negation. But what does it mean when you are using multibyte regex with the "u" modifier?

preg_replace('/\W/u', '', $string);
Xeoncross asked Sep 17 '25 21:09


1 Answer

Anything that isn't a letter, number or underscore.

So, in terms of Unicode character classes, \W matches any character that is not in the L (letter) or N (number) classes and is not the underscore character.

If you were to write it using the \p{xx} syntax, it would be equivalent to [^\p{L}\p{N}_]. Each property needs its own \p{...}; PCRE does not accept a combined \p{LN}.
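A quick way to check this is to run both spellings over the same input and compare the results. The following is a minimal sketch, assuming a stock PHP 7+ build whose PCRE has Unicode property support; the sample string and variable names are made up for illustration, and the negated class is written as [^\p{L}\p{N}_] with one \p{...} per property.

```php
<?php
// Hypothetical sample input: Cyrillic and Japanese letters, ASCII digits,
// an underscore, plus punctuation and spaces that should be stripped.
$string = 'Привет, мир_42! 日本語';

// \W with the u modifier: remove everything that is not a
// Unicode letter, a Unicode number, or an underscore.
$viaShorthand = preg_replace('/\W/u', '', $string);

// The same idea spelled out with explicit Unicode properties.
$viaProperties = preg_replace('/[^\p{L}\p{N}_]/u', '', $string);

echo $viaShorthand, "\n";   // Приветмир_42日本語
echo $viaProperties, "\n";  // Приветмир_42日本語
```

Without the u modifier the pattern is applied byte by byte and \w is ASCII-only by default, so the same call would strip the bytes of every non-ASCII letter as well.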

Welbog answered Sep 20 '25 11:09
