I have written some code that takes a string of html and cleans away any ugly HTML from it using jQuery (see an early prototype in this SO question). It works pretty well, but I stumbled on an issue:
When using .append() to wrap the html in a div, all script elements in the code are evaluated and run (see this SO answer for an explanation of why this happens). I don't want that. Ideally the scripts would be removed, but I can handle that myself later as long as they are not run.
I am using this code:
var wrapper = $('<div/>').append($(html));
I tried to do it this way instead:
var wrapper = $('<div>' + html + '</div>');
But that just brings forth the "Access denied" error in IE that the append() function fixes (see the answer I referenced above).
I think I might be able to rewrite my code to not require a wrapper around the html, but I am not sure, and I'd like to know if it is possible to append html without running scripts in it, anyway.
How do I wrap a piece of unknown html without running scripts inside it, preferably removing them altogether?
Should I throw jQuery out the window and do this with plain JavaScript and DOM manipulation instead? Would that help?
I am not trying to put some kind of security layer on the client side. I am very much aware that it would be pointless.
James suggested that I should filter out the script elements, but look at these two examples (the original first, then James's suggestion):
jQuery("<p/>").append("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there")
keeps the text nodes but writes gnu!
jQuery("<p/>").append(jQuery("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there").not('script'))
Doesn't write gnu!, but also loses the text nodes.
James has updated his answer and I have accepted it. See my latest comment to his answer, though.
How about removing the scripts first?
var wrapper = $('<div/>').append($(html).not('script'));
Assuming script elements in the html are not nested in other elements:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).children().remove('script');
If script elements may be nested inside other elements, use find() instead, since it searches all descendants rather than only direct children:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).find('script').remove();
This also works when html is just text, or when it has text outside any elements.
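On the question of whether plain JavaScript would help: the same approach can be written without jQuery. Below is a minimal sketch, assuming (as the snippets above do) that the browser does not execute script elements inserted through innerHTML. The function name wrapWithoutScripts and its optional doc parameter are made up for this example. They are not part of jQuery or the DOM API.

```javascript
// Sketch: wrap an HTML string in a <div> and drop its <script> elements.
// `wrapWithoutScripts` and `doc` are hypothetical names used for illustration.
function wrapWithoutScripts(html, doc) {
  // Fall back to the page's document when no document object is passed in.
  doc = doc || document;

  var wrapper = doc.createElement('div');

  // Assumption: scripts inserted via innerHTML are parsed but not executed.
  wrapper.innerHTML = html;

  // querySelectorAll returns a static list, so removing nodes while
  // looping over it is safe. It also finds nested script elements.
  var scripts = wrapper.querySelectorAll('script');
  for (var i = 0; i < scripts.length; i++) {
    scripts[i].parentNode.removeChild(scripts[i]);
  }

  return wrapper;
}
```

In a browser, wrapWithoutScripts(html) returns a detached div that can be passed to $(...) for further cleaning. Text nodes are kept because the string is parsed as a whole rather than filtered element by element. This only deals with script elements. Inline handlers such as onerror attributes are left as they are.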