I create an XML document based on user input. One of the XML nodes has a CDATA section. If one of the characters inserted into the CDATA section is 'special' (a control character, I think), the entire XML becomes invalid.
Example:
$dom = new DOMDocument('1.0', 'utf-8');
$dom->appendChild($dom->createElement('root'))
->appendChild($dom->createCDATASection(
"This is some text with a SOH char \x01."
));
$test = new DOMDocument;
$test->loadXml($dom->saveXML());
echo $test->saveXml();
will give
Warning: DOMDocument::loadXML(): CData section not finished
This is some text with a SOH cha in Entity, line: 2 in /newfile.php on line 17
Warning: DOMDocument::loadXML(): PCDATA invalid Char value 1 in Entity, line: 2 in /newfile.php on line 17
Warning: DOMDocument::loadXML(): Sequence ']]>' not allowed in content in Entity, line: 2 in /newfile.php on line 17
Warning: DOMDocument::loadXML(): Sequence ']]>' not allowed in content in Entity, line: 2 in /newfile.php on line 17
Warning: DOMDocument::loadXML(): internal errorExtra content at the end of the document in Entity, line: 2 in /newfile.php on line 17
<?xml version="1.0"?>
Is there a good way in PHP to make sure the CDATA section is valid?
A CDATA section may contain only characters that XML 1.0 allows anywhere in a document (the Char production):
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
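So you have to sanitize your string so that it contains only those characters before passing it to createCDATASection(). The SOH character (#x1) in your example is outside that range, which is why the parser rejects the document.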
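One possible way to do that sanitizing is a single preg_replace() call whose negated character class mirrors the range above. This is only a sketch: stripInvalidXmlChars() is a hypothetical helper name, not part of the DOM extension, and it assumes the input is valid UTF-8 (with the /u modifier, preg_replace() returns null on malformed input). It silently drops the offending characters; replacing them with a placeholder would work just as well.

```php
<?php
// Hypothetical helper: removes every code point outside the XML 1.0 Char range
// (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]).
function stripInvalidXmlChars(string $text): string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8,
    // so \x{...} matches whole code points rather than single bytes.
    $clean = preg_replace(
        '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u',
        '',
        $text
    );

    // preg_replace() returns null when the subject is not valid UTF-8.
    if ($clean === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $clean;
}

// Same document as in the question, but with the text sanitized first.
$dom = new DOMDocument('1.0', 'utf-8');
$dom->appendChild($dom->createElement('root'))
    ->appendChild($dom->createCDATASection(
        stripInvalidXmlChars("This is some text with a SOH char \x01.")
    ));

$test = new DOMDocument;
$test->loadXML($dom->saveXML());
echo $test->saveXML();
```

With the SOH byte removed, the round trip through loadXML() should no longer raise the "invalid Char value" warnings shown in the question.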