How can I strip certain html tags out of a string?

I have a <textarea> that a user types into, and they are allowed to type HTML. When they are done typing, the <textarea> changes back to a <span> containing what they just typed. However, I want to strip out certain tags, such as <script> and <div>, before I put the text back into the <span>.

mottese asked Aug 09 '12 19:08



1 Answer

Believe it or not, you can do this with the browser's built-in HTML parser. Create a new div with document.createElement, assign the contents of the textarea to its innerHTML, and you have a full DOM to work with. <script> elements inserted this way are not executed. Be aware, though, that inline event handlers can still fire while the markup is being parsed (for example, the onerror of an <img> whose src fails to load), so parsing alone does not make untrusted input safe.
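One way to keep the parsing step free of side effects is to build the container inside a separate, inert document. The sketch below assumes document.implementation.createHTMLDocument is available in the target browsers. The name parseInert and its optional doc parameter are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: parse `html` inside a separate document so that
// images are not fetched and inline handlers do not run during parsing.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised outside a browser.
function parseInert(html, doc) {
    doc = doc || document;
    // The new document has no browsing context, so its content stays inert.
    var inertDoc = doc.implementation.createHTMLDocument("");
    var container = inertDoc.createElement("div");
    container.innerHTML = html;
    return container;
}
```

The returned div can be used wherever a div made with document.createElement would be. The markup still has to be cleaned before it is inserted into the live page, since handlers become active again at that point.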

Here's a simple example that removes from an element every tag that does not appear in an ALLOWED_TAGS list, while keeping the contents of the removed tags in place.

var ALLOWED_TAGS = ["STRONG", "EM", "BLOCKQUOTE", "Q", "DEL", "INS", "A"];

function sanitize(el) {
    // Remove every element inside `el` whose tag is not in ALLOWED_TAGS,
    // keeping its children in place.
    // Copy the live collection into an array first, because usurp() mutates the DOM.
    var tags = Array.prototype.slice.call(el.getElementsByTagName("*"));
    for (var i = 0; i < tags.length; i++) {
        if (ALLOWED_TAGS.indexOf(tags[i].nodeName) === -1) {
            usurp(tags[i]);
        }
    }
}

function usurp(p) {
    // Replace element `p` with its children, preserving their order.
    var last = p;
    // Walk the children back to front, inserting each one before the previous one.
    for (var i = p.childNodes.length - 1; i >= 0; i--) {
        var e = p.removeChild(p.childNodes[i]);
        p.parentNode.insertBefore(e, last);
        last = e;
    }
    p.parentNode.removeChild(p);
}
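One consequence of unwrapping every disallowed tag is that the source text inside a removed <script> or <style> stays behind as visible text. A possible variant, sketched here with a hypothetical DROP_TAGS list and the hypothetical names classify and sanitizeDropping, removes those elements together with their contents and unwraps everything else as before.

```javascript
// Same allow-list as in the answer, repeated so this sketch is self-contained.
var ALLOWED_TAGS = ["STRONG", "EM", "BLOCKQUOTE", "Q", "DEL", "INS", "A"];
// Hypothetical list of tags removed together with everything inside them.
var DROP_TAGS = ["SCRIPT", "STYLE"];

// Decide what to do with an element, given its nodeName.
function classify(nodeName) {
    // Elements in foreign namespaces (such as SVG) report lower-case names,
    // so the drop check is made case-insensitive.
    if (DROP_TAGS.indexOf(nodeName.toUpperCase()) !== -1) { return "drop"; }
    if (ALLOWED_TAGS.indexOf(nodeName) !== -1) { return "keep"; }
    return "unwrap";
}

function sanitizeDropping(el) {
    // Snapshot the live collection, since the loop mutates the DOM.
    var tags = Array.prototype.slice.call(el.getElementsByTagName("*"));
    for (var i = 0; i < tags.length; i++) {
        var node = tags[i];
        var action = classify(node.nodeName);
        if (action === "unwrap") {
            // Move the children up, in order, in front of the element.
            while (node.firstChild) {
                node.parentNode.insertBefore(node.firstChild, node);
            }
        }
        if (action !== "keep") {
            // Remove the element itself; a dropped element takes its contents along.
            node.parentNode.removeChild(node);
        }
    }
}
```

Which tags belong in DROP_TAGS is a judgment call; <script> and <style> are only the most obvious candidates.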

As mentioned, you'll need to create an empty div container to use sanitize. Here's one application of the technique, a function that sanitizes strings. Note, however, that "sanitize" is a misnomer at this point: attributes are left untouched, so an onclick handler or a javascript: href on an allowed tag survives. It will take a lot more work (cleaning attribute strings and such) before this "sanitizer" outputs HTML that is truly safe.

function sanitizeString(string) {
    var div = document.createElement("div");
    div.innerHTML = string;
    sanitize(div);
    return div.innerHTML;
}
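The attribute cleaning mentioned above could start from a per-tag allow-list along these lines. This is a minimal sketch, not a complete or audited filter: ALLOWED_ATTRS, URL_ATTRS, SAFE_SCHEMES and the function names are all hypothetical, and the choice of permitted attributes and URL schemes is an assumption to adapt.

```javascript
// Hypothetical allow-list: attribute names permitted per tag
// (upper-case tag names, as nodeName reports them for HTML elements).
var ALLOWED_ATTRS = {
    "A": ["href", "title"],
    "BLOCKQUOTE": ["cite"],
    "Q": ["cite"],
    "DEL": ["cite", "datetime"],
    "INS": ["cite", "datetime"]
};
// Attributes whose values are URLs and therefore need a scheme check.
var URL_ATTRS = ["href", "cite"];
// Hypothetical set of URL schemes considered acceptable.
var SAFE_SCHEMES = ["http:", "https:", "mailto:"];

function isSafeUrl(value) {
    // Strip whitespace and control characters first, so that values such as
    // "java\tscript:" cannot slip past the scheme check.
    var v = String(value).replace(/[\u0000-\u0020]/g, "");
    var m = /^([a-zA-Z][a-zA-Z0-9+.\-]*:)/.exec(v);
    if (!m) { return true; } // no scheme, so treat it as a relative URL
    return SAFE_SCHEMES.indexOf(m[1].toLowerCase()) !== -1;
}

function isAllowedAttribute(tagName, attrName, value) {
    var tag = tagName.toUpperCase();
    var name = attrName.toLowerCase();
    var allowed = Object.prototype.hasOwnProperty.call(ALLOWED_ATTRS, tag)
        ? ALLOWED_ATTRS[tag] : [];
    if (allowed.indexOf(name) === -1) { return false; }
    if (URL_ATTRS.indexOf(name) !== -1) { return isSafeUrl(value); }
    return true;
}

// Remove every attribute of `el` that the allow-list rejects.
function cleanAttributes(el) {
    // Copy the live attribute list, since removing attributes mutates it.
    var attrs = Array.prototype.slice.call(el.attributes);
    for (var i = 0; i < attrs.length; i++) {
        if (!isAllowedAttribute(el.nodeName, attrs[i].name, attrs[i].value)) {
            el.removeAttribute(attrs[i].name);
        }
    }
}
```

cleanAttributes would be called on each element that sanitize decides to keep. Because anything not listed is rejected, event handler attributes such as onclick and the style attribute are removed without being named explicitly.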
Joe Taylor answered Nov 10 '22 17:11
