
innerHTML unencodes < in attributes

I have an HTML document that might contain &lt; and &gt; in some of its attributes. I am trying to extract this markup and run it through an XSLT, but the XSLT engine fails with an error saying that < is not valid inside an attribute.

I did some digging and found that the characters are properly escaped in the source document, but once the markup is loaded into the DOM via innerHTML and read back, the attributes come out decoded. Strangely, this happens for &lt; and &gt;, but not for others such as &amp;.

Here is a simple example:

var div = document.createElement('DIV');
div.innerHTML = '<div asdf="&lt;50" fdsa="&amp;50"></div>';
// Observed: asdf comes back with a raw <, while fdsa keeps &amp;
console.log(div.innerHTML);

I'm assuming that the DOM implementation decided that HTML attributes can be less strict than XML attributes, and that this is "working as intended". My question is, can I work around this without writing some horrible regex replacement?

murrayju asked Oct 06 '15 15:10


1 Answer

Try XMLSerializer, which serializes the DOM as XML and so escapes < and > in attribute values:

var div = document.getElementById('d1');

// HTML serialization, shown for comparison
var pre = document.createElement('pre');
pre.textContent = div.outerHTML;
document.body.appendChild(pre);

// XML serialization of the same element
pre = document.createElement('pre');
pre.textContent = new XMLSerializer().serializeToString(div);
document.body.appendChild(pre);

HTML used by the snippet:

<div id="d1" data-foo="a &lt; b &amp;&amp; b &gt; c">This is a test</div>

You might need to adapt the XSLT to account for the XHTML namespace (xmlns="http://www.w3.org/1999/xhtml") that XMLSerializer adds to the output (at least in my test with Firefox).
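
If the added namespace is a problem, a small hand-written serializer is another possibility. The following is only a minimal sketch: the names escapeXmlAttr, escapeXmlText and serializeAsXml are made up for illustration, and it assumes the tree contains only elements and text nodes (comments, CDATA sections and namespaces are ignored). It does use regular expressions, but only on the already-decoded attribute and text values, not on the markup itself.

```javascript
// Hypothetical helpers for illustration; not part of any library.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Escape a string for use inside a double-quoted XML attribute value.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Escape a string for use as XML character data.
function escapeXmlText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Recursively serialize an element or text node.
// Assumption: any other node type is skipped and no xmlns is emitted.
function serializeAsXml(node) {
  if (node.nodeType === TEXT_NODE) {
    return escapeXmlText(node.nodeValue);
  }
  if (node.nodeType !== ELEMENT_NODE) {
    return '';
  }
  const name = node.localName;
  const attrs = Array.from(node.attributes)
    .map((a) => ' ' + a.name + '="' + escapeXmlAttr(a.value) + '"')
    .join('');
  const children = Array.from(node.childNodes).map(serializeAsXml).join('');
  // Elements without children are written in self-closing form.
  return children === ''
    ? '<' + name + attrs + '/>'
    : '<' + name + attrs + '>' + children + '</' + name + '>';
}
```

With the example from the question, serializeAsXml(div.firstChild) should give <div asdf="&lt;50" fdsa="&amp;50"/>, with no namespace declaration for the XSLT to deal with.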

Martin Honnen answered Oct 30 '22 12:10