
Reading XML CDATA section with ]] in it

I'm coding an RSS reader in JavaScript using XMLHttpRequest.

For some RSS feeds I had no problems, but in other cases xmlDocument.firstChild was always null.

After comparing the XML that worked with the XML that didn't, I found that the following is the cause of the error.

<item>
    <description>
        <![CDATA[This is a description for a test [...]]]>
    </description>
</item>

The closing bracket immediately before the closing brackets of the CDATA section seems to cause this strange behaviour. As a test, I read the same XML using C# and LINQ, and everything worked okay.

Then I tried adding a space between the closing brackets, like the following:

<![CDATA[This is a description for a test [...] ]]>

And it worked!

My JavaScript code:

function LoadRSS() {
    http_request.onreadystatechange = function () { showContent(http_request); };
    http_request.open("GET", "./feeds/test.xml", true);
    http_request.send(null); // a GET request carries no body
}


function showContent(http_request) {
    if (http_request.readyState == 4) {
        if (http_request.status == 200) {
            var parser = new DOMParser();
            var xml_doc = parser.parseFromString(http_request.responseText, "text/xml");
            alert(xml_doc.firstChild);
        }
        else {
            xml_doc = null;
        }
    }
}

Has anyone faced something similar? I really don't know how to proceed now; any comments and suggestions are welcome.

YasuDevil asked Jan 11 '11 17:01



2 Answers

Whatever browser you're using seems to be parsing CDATA sections incorrectly. Only ]]> marks the end of the section; any other square brackets should not affect this at all.
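
To make the rule concrete, here is a minimal sketch of how a CDATA section is delimited: scan from "<![CDATA[" to the first "]]>" that follows, treating every other bracket as ordinary text. The function name extractCdataSections is hypothetical and is not part of any browser API; it only illustrates the termination rule and is not a replacement for a real XML parser.

```javascript
// Hypothetical helper: returns the text of every CDATA section in `xml`.
// A section starts at "<![CDATA[" and ends at the FIRST "]]>" after it;
// any other "[" or "]" inside is plain character data.
function extractCdataSections(xml) {
    var OPEN = "<![CDATA[";
    var CLOSE = "]]>";
    var sections = [];
    var from = 0;
    while (true) {
        var start = xml.indexOf(OPEN, from);
        if (start === -1) break;            // no more sections
        var contentStart = start + OPEN.length;
        var end = xml.indexOf(CLOSE, contentStart);
        if (end === -1) break;              // unterminated section: stop scanning
        sections.push(xml.substring(contentStart, end));
        from = end + CLOSE.length;
    }
    return sections;
}

// The description from the question, with "]" directly before "]]>".
var sample = "<item><description><![CDATA[This is a description for a test [...]]]></description></item>";
var result = extractCdataSections(sample);
console.log(JSON.stringify(result));
```

With the question's input, the single extracted section keeps its trailing "]", which is what a conforming parser should also report as the text content.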

casablanca answered Oct 21 '22 15:10



As for "how to proceed"... why not always include a space before the end of the CDATA block? Do you not have control over the generated XML? If you don't, you could patch the response text in JS before parsing it:

var xml = http_request.responseText.replace( /\]\]>/g, ' ]]>' );
var xml_doc = parser.parseFromString(xml, "text/xml");
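
A narrower variant of the same workaround, as a rough sketch: pad only where a "]" sits directly before "]]>", so sections that already parse are left untouched. The function name padBracketBeforeCdataEnd is hypothetical. Note that either version changes the text content by adding a trailing space, so trim it afterwards if that matters.

```javascript
// Hypothetical helper: insert a space between a "]" and the "]]>" that
// directly follows it, so "[...]]]>" becomes "[...] ]]>".
// Text where "]]>" is not preceded by "]" is returned unchanged.
function padBracketBeforeCdataEnd(xml) {
    return xml.replace(/\]\]\]>/g, "] ]]>");
}

var before = "<![CDATA[This is a description for a test [...]]]>";
var after = padBracketBeforeCdataEnd(before);
console.log(after);

// A section without the extra bracket is not modified.
var untouched = padBracketBeforeCdataEnd("<![CDATA[plain text]]>");
console.log(untouched);
```

In the question's showContent function this would be applied to http_request.responseText before the string is passed to parseFromString.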
Phrogz answered Oct 21 '22 15:10
