
PHP: convert Unicode spaces to ASCII spaces

I believe I'm receiving data that mixes Unicode spaces and ASCII spaces, so strings that look the same are not equal; for example, "water resistant" != "water resistant". In my database the strings do look different, showing the stray characters you normally see when a multibyte character is displayed in the wrong encoding: "water resistantÂ" and " water resistant".

I would like a way to convert all spaces to ASCII spaces or, if that is easier, all spaces to multibyte spaces.

I've tried preg_replace, but afterwards the strings are no longer valid multibyte strings (multibyte characters in them appear as garbage):

preg_replace('/[\pZ\pC]/',' ',$field);

I've also tried using mb_ereg_replace, but it had no effect.

mb_ereg_replace('/[\pZ\pC]/',' ',$field)
Kai asked Nov 20 '13 19:11


3 Answers

If the offending character is a UTF-8 non-breaking space (U+00A0, encoded as the bytes C2 A0), you can replace it with a standard ASCII space via:

$string = str_replace("\xc2\xa0", "\x20", $string);
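A minimal sketch of how that replacement fixes the comparison, assuming the data is UTF-8 and the stray character really is a non-breaking space; the variable names here are illustrative only:

```php
<?php
// Two strings that look identical on screen.
// The first contains U+00A0 (UTF-8 bytes C2 A0) instead of an ASCII space.
$withNbsp  = "water\xc2\xa0resistant";
$withSpace = "water resistant";

var_dump($withNbsp === $withSpace); // bool(false)

// Replace the two-byte NBSP sequence with a single ASCII space (0x20).
$normalized = str_replace("\xc2\xa0", "\x20", $withNbsp);

var_dump($normalized === $withSpace); // bool(true)
```

The "Â" seen in the database is typically the C2 byte of that sequence shown as ISO-8859-1; the A0 byte that follows is itself a non-breaking space in that encoding, so it stays invisible.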
Rob Evans answered Sep 20 '22 00:09


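It looks like preg_replace('/[\pZ\pC]/u',' ',$field); works. I had forgotten the u modifier at the end of the regex, which tells PCRE to treat the pattern and the subject as UTF-8. Without it, matching happens byte by byte, so individual bytes of multibyte characters get replaced and the string is corrupted.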
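As a rough sketch, the same pattern can be wrapped in a small helper. The function name normalize_spaces is hypothetical, and the code assumes PHP 7 or later and UTF-8 input:

```php
<?php
// Hypothetical helper, not part of the original answer.
function normalize_spaces(string $field): string
{
    // \p{Z} matches every Unicode separator (including U+00A0 and U+2003);
    // \p{C} matches control, format and unassigned characters.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/[\p{Z}\p{C}]/u', ' ', $field);

    // preg_replace returns null on error, e.g. when $field is not valid UTF-8;
    // in that case this sketch hands back the input unchanged.
    return $result ?? $field;
}

var_dump(normalize_spaces("water\u{00A0}resistant") === "water resistant");           // bool(true)
var_dump(normalize_spaces("caf\u{00E9}\u{2003}au lait") === "caf\u{00E9} au lait"); // bool(true)
```

Note that \p{C} also covers tabs and newlines, so this turns them into spaces as well. If only space-like characters should be touched, narrowing the class to \p{Zs} may be a better fit.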

Kai answered Sep 20 '22 00:09


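I think you're looking for utf8_decode($field). Be aware that it converts UTF-8 to ISO-8859-1, so a non-breaking space becomes the single byte 0xA0 rather than an ASCII space, and that the function is deprecated as of PHP 8.2.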
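To see what a UTF-8 to ISO-8859-1 conversion does to a non-breaking space, here is a small sketch. It uses mb_convert_encoding from the mbstring extension as a stand-in for utf8_decode, on the assumption that the two agree for characters that exist in ISO-8859-1:

```php
<?php
// Requires the mbstring extension.
$field = "water\u{00A0}resistant"; // UTF-8, NBSP stored as bytes C2 A0

// Convert from UTF-8 to ISO-8859-1 (Latin-1).
$latin1 = mb_convert_encoding($field, 'ISO-8859-1', 'UTF-8');

// The NBSP is now the single byte A0, still not an ASCII space (20).
var_dump($latin1 === "water\xa0resistant"); // bool(true)
var_dump($latin1 === "water resistant");    // bool(false)
```

So the conversion shortens the string by one byte but does not by itself make the two strings compare equal.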

Joren answered Sep 18 '22 00:09