
Replace these hex chars in a string in PHP

I'm generating XML from data that comes from a database (and some JSON feeds).

I'm having problems with some texts that contain hex characters that break my XML.

For example, see this screenshot of the error I get from Chrome (image: XML error).

I identified the hex characters that are giving me problems (I believe they're called control characters). They are:

0x03
0x05
0x16
0x0E

How can I replace those characters with PHP before printing them on my XML output?

Thanks!

Guillermo asked Apr 12 '12 22:04



2 Answers

More than just those characters will break it...

$s = preg_replace('/[\x00-\x1f]/', '?', $s);
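
Note that the range `\x00-\x1f` also covers tab (0x09), line feed (0x0A) and carriage return (0x0D), which XML 1.0 permits. A minimal sketch that leaves those three alone, assuming the input is a plain byte string or UTF-8; the function name `replace_xml_control_chars` is made up for illustration and is not part of the answer above:

```php
<?php
// Hypothetical helper (illustrative name, PHP 7+ type hints).
// Replaces the C0 control characters that XML 1.0 forbids, but keeps
// tab (0x09), line feed (0x0A) and carriage return (0x0D).
function replace_xml_control_chars(string $s, string $substitute = '?'): string
{
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $substitute, $s);
}

// Usage: the tab and the newline survive, 0x03 and 0x16 do not.
$clean = replace_xml_control_chars("a\x03b\tc\x16d\n"); // "a?b\tc?d\n"
```

Working on bytes is safe for UTF-8 input, since bytes below 0x20 never appear inside a multi-byte sequence.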
Ignacio Vazquez-Abrams answered Sep 21 '22 12:09



The characters you list are indeed control characters, all in the C0 set:

0x03 - ETX  End of Text
0x05 - ENQ  Enquiry
0x0E - SO   Shift Out
0x16 - SYN  Synchronous Idle
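
To see which of these characters actually occur in a given string, and where, something like the following could help. It is only a diagnostic sketch; `find_control_chars` is a hypothetical name, and the character class deliberately skips tab, line feed and carriage return:

```php
<?php
// Hypothetical diagnostic helper (illustrative name, PHP 7.1+).
// Returns a map of byte offset => hex code for every C0 control
// character other than tab, line feed and carriage return.
function find_control_chars(string $s): array
{
    $found = [];
    preg_match_all('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $s, $matches, PREG_OFFSET_CAPTURE);
    foreach ($matches[0] as [$char, $offset]) {
        $found[$offset] = sprintf('0x%02X', ord($char));
    }
    return $found;
}

// Usage: ETX at byte offset 2, SYN at byte offset 5.
$report = find_control_chars("ab\x03cd\x16"); // [2 => '0x03', 5 => '0x16']
```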

You should verify how these characters got into the string. I would not recommend removing them outright; if you do remove them, at least put a substitution character in their place. Since they are not invalid Unicode, a more conservative option is to convert them to numeric entities (this has been done successfully here, too):

// Map each control character to its numeric character reference.
$pairs = array(
    "\x03" => "&#x3;",
    "\x05" => "&#x5;",
    "\x0E" => "&#xE;",
    "\x16" => "&#x16;",
);
$xml = strtr($xml, $pairs);
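
If more control characters turn up than the four listed, the table can be generated instead of typed out. A sketch using `preg_replace_callback`, under the assumption that the consumer accepts such references: XML 1.1 does, while XML 1.0 does not allow character references to these code points either, so a strict XML 1.0 parser may still reject the result. The function name `control_chars_to_entities` is hypothetical:

```php
<?php
// Hypothetical helper (illustrative name); generalizes the strtr() table.
// Converts C0 control characters to hexadecimal numeric character
// references. NUL is skipped because no XML version allows a reference
// to it; tab, LF and CR are skipped because they are legal as they are.
function control_chars_to_entities(string $s): string
{
    return preg_replace_callback(
        '/[\x01-\x08\x0B\x0C\x0E-\x1F]/',
        function (array $m): string {
            return sprintf('&#x%X;', ord($m[0]));
        },
        $s
    );
}

// Usage:
$xmlText = control_chars_to_entities("a\x03b\x16"); // "a&#x3;b&#x16;"
```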

Hope this is helpful.

hakre answered Sep 21 '22 12:09
