I have an HTML document that might have < and > in some of the attributes. I am trying to extract this and run it through an XSLT, but the XSLT engine errors, telling me that < is not valid inside of an attribute.
I did some digging and found that it is properly escaped in the source document, but when this is loaded into the DOM via innerHTML, the DOM decodes the attributes. Strangely, it does this for < and >, but not for some others like &.
Here is a simple example:
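var div = document.createElement('DIV');
// Both attribute values are escaped in the source markup.
div.innerHTML = '<div asdf="&lt;50" fdsa="&amp;50"></div>';
// Logs the markup as re-serialized by the DOM.
console.log(div.innerHTML);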
I'm assuming that the DOM implementation decided that HTML attributes can be less strict than XML attributes, and that this is "working as intended". My question is, can I work around this without writing some horrible regex replacement?
Try XMLSerializer:
// JavaScript
var div = document.getElementById('d1');
// First, show the HTML serialization for comparison.
var pre = document.createElement('pre');
pre.textContent = div.outerHTML;
document.body.appendChild(pre);
// Then show the XML serialization of the same element.
pre = document.createElement('pre');
pre.textContent = new XMLSerializer().serializeToString(div);
document.body.appendChild(pre);
<!-- HTML -->
<div id="d1" data-foo="a &lt; b &amp;&amp; b &gt; c">This is a test</div>
You might need to adapt the XSLT to take account of the XHTML namespace XMLSerializer inserts (at least here in a test with Firefox).
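To illustrate the namespace point, here is a minimal sketch of what the adapted stylesheet and the surrounding calls might look like in a browser. The stylesheet, the h prefix, and the transformElement name are made up for this example and are not from the question; the only assumption carried over is that XMLSerializer puts the elements in the XHTML namespace, so the match pattern has to use a prefix bound to that namespace.

```javascript
// Namespace that XMLSerializer is expected to put HTML elements in.
const XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical stylesheet: binds the prefix "h" to the XHTML namespace so
// that "h:div" matches the serialized <div>. An unprefixed "div" pattern
// would only match elements in no namespace.
const STYLESHEET = `<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="${XHTML_NS}"
    exclude-result-prefixes="h">
  <xsl:template match="h:div">
    <p><xsl:value-of select="@data-foo"/></p>
  </xsl:template>
</xsl:stylesheet>`;

// Hypothetical helper (browser only): serializes an element as XML,
// re-parses it as an XML document, and applies the stylesheet above.
function transformElement(element) {
  const xml = new XMLSerializer().serializeToString(element);
  const parser = new DOMParser();
  const source = parser.parseFromString(xml, 'application/xml');
  const sheet = parser.parseFromString(STYLESHEET, 'application/xml');
  const processor = new XSLTProcessor();
  processor.importStylesheet(sheet);
  // Returns a DocumentFragment owned by the current page.
  return processor.transformToFragment(source, document);
}
```

With the d1 element from the answer on the page, this could be used as document.body.appendChild(transformElement(document.getElementById('d1'))). If your existing XSLT matches unprefixed element names, the same change (declare a prefix for the XHTML namespace and use it in the match patterns) should apply.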