I'm generating XML from data that comes from a database (and some JSON feeds).
I'm having problems with some text that contains hex characters that break my XML.
For example, see this screenshot of the error I get from Chrome:
I identified the hex characters that are giving me problems (I believe they're called control characters). And these are:
0x03
0x05
0x16
0x0E
How can I replace those characters with PHP before printing them on my XML output?
Thanks!
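More than just those characters will break it: nearly every control character below 0x20 is invalid in XML 1.0.
// Replace C0 control characters, but keep tab (0x09), LF (0x0A) and CR (0x0D), which XML allows.
$s = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '?', $s);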
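The byte-range approach above can be widened to the full set of characters that XML 1.0 allows. The following is only a sketch: the function name `xml10_replace_invalid` is made up for illustration, and it assumes the input is valid UTF-8 (with the `/u` modifier, `preg_replace()` returns `null` on malformed UTF-8).

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread):
// replace every code point that XML 1.0 does not permit in a document.
function xml10_replace_invalid(string $s, string $replacement = '?'): string
{
    // Permitted in XML 1.0: tab, LF, CR, U+0020-U+D7FF,
    // U+E000-U+FFFD and U+10000-U+10FFFF. Everything else is replaced.
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';

    $clean = preg_replace($pattern, $replacement, $s);
    if ($clean === null) {
        // With /u, preg_replace() fails on input that is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $clean;
}
```

Calling `xml10_replace_invalid($s)` before writing `$s` into the XML output swaps each disallowed character for `?`; passing `''` as the second argument drops them instead.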
The characters you list are indeed control characters, all in the C0 set:
0x03 - ETX End of Text
0x05 - ENQ Enquiry
0x0E - SO Shift Out
0x16 - SYN Synchronous Idle
You should verify how these characters got into the string. I can't really recommend removing them (if you do plan to remove them, at least put a substitution character in their place instead of dropping them silently). Since they aren't invalid Unicode, a more conservative option is to convert them to numeric entities (this has been successfully done here, too):
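// Map each control character to its numeric character reference.
// Note: XML 1.0 parsers reject these references; they are only allowed in XML 1.1.
$pairs = array(
    "\x03" => "&#x03;",
    "\x05" => "&#x05;",
    "\x0E" => "&#x0E;",
    "\x16" => "&#x16;",
);
$xml = strtr($xml, $pairs);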
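To help verify how these characters got into the string, it can be useful to list where they occur before deciding what to do with them. This is a rough diagnostic sketch; `find_control_chars` is a hypothetical name, and the offsets it reports are byte offsets, not character positions.

```php
<?php
// Hypothetical diagnostic helper (not from the thread): report each C0
// control character other than tab, LF and CR, keyed by its byte offset.
function find_control_chars(string $s): array
{
    $found = [];
    // PREG_OFFSET_CAPTURE makes every match a [text, byteOffset] pair.
    if (preg_match_all('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $s, $m, PREG_OFFSET_CAPTURE)) {
        foreach ($m[0] as [$char, $offset]) {
            $found[$offset] = sprintf('0x%02X', ord($char));
        }
    }
    return $found;
}
```

Running this over each database field or JSON value before building the XML shows which source is supplying the stray bytes.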
Hope this is helpful.