PHP utf8_encode() converts spaces to non-breaking spaces [closed]

Tags: php, unicode, utf-8

Perfectly simple: utf8_encode($string) replaces regular spaces with non-breaking spaces ("\u00a0"). I tried filtering the result with str_replace:

str_replace("\u00a0", " ", utf8_encode($string))

But that didn't fix it.

EDIT: Sigh, I'm an idiot. It's not a problem with utf8_encode() either. I thought I was using that function, forgot I disabled it in my code. My data is being run through json_encode() for an AJAX request. Is it a problem with json_encode()? I worry I may be guilty of abusing Stack Overflow. I'll try Googling it.

FINAL EDIT: Problem was with the data itself, which was copied from a Word document into a MySQL table. All the spaces were copied as non-breaking spaces. Sorry for wasting everyone's time.

Asked by William, Aug 16 '11 21:08

1 Answer

str_replace("\u00a0", " ", utf8_encode($dat)). But that didn't fix it.

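PHP only has byte strings, not native Unicode strings. Consequently "\u00a0" is not an escape sequence, and you were asking str_replace to look for the literal six characters backslash, u, 0, 0, a, 0 in the input. (PHP 7.0 later added a braced form, "\u{00a0}", but the unbraced form is still left as literal text.)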
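
A quick way to check this yourself, as a minimal sketch: measure the string and dump its bytes. The variable names are only illustrative, and the last two lines assume PHP 7.0 or later.

```php
<?php
// "\u00a0" is not an escape sequence in PHP: the string holds the six
// literal bytes backslash, u, 0, 0, a, 0.
$literal = "\u00a0";
echo strlen($literal), "\n";   // 6
echo bin2hex($literal), "\n";  // 5c7530306130

// Assumption: PHP 7.0+, where the braced form yields the UTF-8 bytes.
$nbsp = "\u{00a0}";
echo bin2hex($nbsp), "\n";     // c2a0
```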

To get rid of non-breaking space characters you would have to replace "\xA0" (if done over the ISO-8859-1 data you presumably have before passing it to utf8_encode) or "\xC2\xA0" (if done after transcoding to UTF-8). Both must be written in double-quoted strings for the \x escapes to be interpreted.
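
The two replacements could be sketched roughly as follows. The helper names are hypothetical (they are not built-in PHP functions), and the type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helpers for illustration only.

// ISO-8859-1 (Latin-1) data: NBSP is the single byte 0xA0.
function nbsp_to_space_latin1(string $s): string
{
    return str_replace("\xA0", " ", $s);
}

// UTF-8 data: NBSP is the two-byte sequence 0xC2 0xA0.
function nbsp_to_space_utf8(string $s): string
{
    return str_replace("\xC2\xA0", " ", $s);
}

echo bin2hex(nbsp_to_space_latin1("a\xA0b")), "\n";   // 612062
echo bin2hex(nbsp_to_space_utf8("a\xC2\xA0b")), "\n"; // 612062
```

Pick the variant that matches the encoding the string is in at that point. Running the single-byte replacement over UTF-8 data would damage other characters, since 0xA0 also occurs as the second byte of characters such as "à" (0xC3 0xA0).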

utf8_encode only transcodes ISO-8859-1 to UTF-8 and doesn't touch spaces, so my suspicion is that you have non-breaking space characters in your actual data.
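
One way to confirm that suspicion is to look at the raw bytes of a stored value. This is only a sketch: the helper name and the sample string are made up, and the data is assumed to be UTF-8 already.

```php
<?php
// Hypothetical helper: count UTF-8 non-breaking spaces (0xC2 0xA0) in a string.
function count_nbsp_utf8(string $s): int
{
    return substr_count($s, "\xC2\xA0");
}

// Made-up sample standing in for a value read from the database.
$sample = "Hello\xC2\xA0world";

echo count_nbsp_utf8($sample), "\n"; // 1
echo bin2hex($sample), "\n";         // 48656c6c6fc2a0776f726c64
```

A regular space would show up as 20 in the hex dump, so c2a0 where a space is expected means the data itself contains non-breaking spaces.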

Answered by bobince, Sep 25 '22 12:09