I am generating XML using PHP library as below:
$dom = new DOMDocument("1.0","utf-8");
Doing above results in a page which shows a message on top of the output.
This page contains the following errors: error on line 16 at column 274505: PCDATA invalid Char value 27 Below is a rendering of the page up to the first error.
I have tried rectifying using Tidy library.. used iconv to get the chinese character in UTF-8.
A useful function to get rid of that error is suggested on this website. http://www.phpwact.org/php/i18n/charsets#common_problem_areas_with_utf-8
When you put utf-8 encoded strings in a XML document you should remember that not all utf-8 valid chars are accepted in a XML document http://www.w3.org/TR/REC-xml/#charsets
So you should strip away the unwanted chars, else you’ll have an XML fatal parsing error such as above
function utf8_for_xml($string)
{
return preg_replace ('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $string);
}
Hope that saves someone else some time..
Prashant is absolutely right. You can also strip away invalid characters in Javascript by doing:
function utf8_for_xml(inputStr) {
return inputStr.replace(/[^\x09\x0A\x0D\x20-\xFF\x85\xA0-\uD7FF\uE000-\uFDCF\uFDE0-\uFFFD]/gm, '');
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With