
Are the PHP preg_functions multibyte safe?

PHP has no multibyte ('mb') variants of the preg functions, so does that mean the default preg_ functions are all multibyte safe? I couldn't find any mention of this in the PHP documentation.

Spoonface asked Nov 19 '09 20:11


1 Answer

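PCRE supports UTF-8 out of the box, but only when you ask for it: see the documentation for the 'u' pattern modifier. Without 'u', the preg_ functions treat the pattern and subject as sequences of single bytes.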

Illustration (\xC3\xA4 is the UTF-8 encoding of the German letter "ä"):

  echo preg_replace('~\w~', '@', "a\xC3\xA4b"); 

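This echoes "@@¤@" because "\xC3" and "\xA4" were treated as two distinct single-byte symbols. (Which bytes above 0x7F match \w without 'u' depends on the locale's character tables, so the exact output can differ between setups; under the plain "C" locale neither byte matches and both pass through unchanged.)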

  echo preg_replace('~\w~u', '@', "a\xC3\xA4b"); 

With the 'u' modifier, this prints "@@@" because "\xC3\xA4" is treated as a single letter.
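The byte-versus-character difference can also be seen without relying on \w, whose behaviour above 0x7F is locale dependent. The following is a minimal sketch, not part of the original answer; the variable names are only illustrative. It counts matches of '.' with and without 'u', splits the string into characters, and shows that an invalid UTF-8 subject makes a 'u' pattern fail rather than match.

```php
<?php
// "aäb" encoded as UTF-8: 3 characters, 4 bytes.
$s = "a\xC3\xA4b";

// Without 'u', '.' consumes one byte per match.
$bytes = preg_match_all('~.~', $s);   // 4

// With 'u', '.' consumes one UTF-8 code point per match.
$chars = preg_match_all('~.~u', $s);  // 3

// Splitting on the empty pattern with 'u' yields whole characters.
$parts = preg_split('~~u', $s, -1, PREG_SPLIT_NO_EMPTY); // ["a", "ä", "b"]

// A subject that is not valid UTF-8 makes a 'u' pattern fail:
// preg_match() returns false and preg_last_error() reports the reason.
$bad = preg_match('~.~u', "\xC3");
$badIsUtf8Error = ($bad === false
    && preg_last_error() === PREG_BAD_UTF8_ERROR);

echo $bytes, ' ', $chars, ' ', implode('|', $parts), "\n";
```

Because of the last case, it is worth checking for a false return value when the input encoding is not guaranteed to be UTF-8.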

user187291 answered Sep 28 '22 01:09