Is there an alternative to PHP's strip_tags()

Question

The strip_tags() documentation tells us that all the tags except the that in the second parameter are stripped. The operation this function performs is totally opposite to its name. It should have been named strip_all_tags_except().

Let's forget about the name and come to what I want to ask. I want the functionality of removing only the tags I mention in the second parameter. ie. I want the following to strip tags <iframe><script><style><embed><object> and allow all others.

my_strip_tags($data,'<iframe><script><style><embed><object>');

It's pretty opposite to what strip_tags() does.

How do I make this happen?

Ry- · Accepted Answer

Updated 2012-06-23; major security flaw.

Here's a class from another project that should do what you're looking for:

final class Filter {
    private function __construct() {}

    const SafeTags = 'a abbr acronym address b bdo big blockquote br caption center cite code col colgroup dd del dfn dir div dl dt em font h1 h2 h3 h4 h5 h6 hr i img ins kbd legend li ol p pre q s samp small span strike strong sub sup table tbody td tfoot th thead tr tt u ul var article aside figure footer header nav section rp rt ruby dialog hgroup mark time';
    const SafeAttributes = 'href src title alt type rowspan colspan lang';
    const URLAttributes  = 'href src';

    public static function HTML($html) {
        # Get array representations of the safe tags and attributes:
        $safeTags = explode(' ', self::SafeTags);
        $safeAttributes = explode(' ', self::SafeAttributes);
        $urlAttributes = explode(' ', self::URLAttributes);

        # Parse the HTML into a document object:
        $dom = new DOMDocument();
        $dom->loadHTML('<div>' . $html . '</div>');

        # Loop through all of the nodes:
        $stack = new SplStack();
        $stack->push($dom->documentElement);

        while($stack->count() > 0) {
            # Get the next element for processing:
            $element = $stack->pop();

            # Add all the element's child nodes to the stack:
            foreach($element->childNodes as $child) {
                if($child instanceof DOMElement) {
                    $stack->push($child);
                }
            }

            # And now, we do the filtering:
            if(!in_array(strtolower($element->nodeName), $safeTags)) {
                # It's not a safe tag; unwrap it:
                while($element->hasChildNodes()) {
                    $element->parentNode->insertBefore($element->firstChild, $element);
                }

                # Finally, delete the offending element:
                $element->parentNode->removeChild($element);
            } else {
                # The tag is safe; now filter its attributes:
                for($i = 0; $i < $element->attributes->length; $i++) {
                    $attribute = $element->attributes->item($i);
                    $name = strtolower($attribute->name);

                    if(!in_array($name, $safeAttributes) || (in_array($name, $urlAttributes) && substr($attribute->value, 0, 7) !== 'http://')) {
                        # Found an unsafe attribute; remove it:
                        $element->removeAttribute($attribute->name);
                        $i--;
                    }
                }
            }
        }

        # Finally, return the safe HTML, minus the DOCTYPE, <html> and <body>:
        $html  = $dom->saveHTML();
        $start = strpos($html, '<div>');
        $end   = strrpos($html, '</div>');

        return substr($html, $start + 5, $end - $start - 5);
    }
}

Your Common Sense · Answer

It shouldn't happen at all.

strip_tags is only usable if used without any parameters. Otherwise you will have an XSS in any tag allowed.

As a matter of fact, your concern should be not only tags but also attributes. So, use some sort of HTML purifier instead.

César Rodríguez · Answer

I usually work with htmLawed lib, you can use it to filter, secure & sanitize HTML

http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/more.htm

Is there an alternative to PHP's strip_tags()

Tags:

php

Tabrez Ahmed

3 Answers

Ry-

Your Common Sense

César Rodríguez

Recent Activity

Donate For Us

Is there an alternative to PHP's strip_tags()

Tags:

php

Tabrez Ahmed

3 Answers

Ry-

Your Common Sense

César Rodríguez

Related questions

Recent Activity

Donate For Us