How to remove only HTML tags from a string using JavaScript

Tags: javascript, html, jquery, string

I want to remove HTML tags from a given string using JavaScript. I looked into the current approaches, but each of them has a problem that I have not been able to solve.

Current solutions

(1) Using JavaScript: create a temporary div element and read its text

  function remove_tags(html)
  {
       var tmp = document.createElement("DIV");
       tmp.innerHTML = html; 
       return tmp.textContent||tmp.innerText; 
  }

(2) Using regex

  function remove_tags(html)
  {
       return html.replace(/<(?:.|\n)*?>/gm, '');
  }

(3) Using jQuery

  function remove_tags(html)
  {
       return jQuery(html).text();
  }

These three solutions work for ordinary markup, but if the string is like this

  <div> hello <hi all !> </div>

the stripped string is just hello. I need to remove only real HTML tags, so the result should be hello <hi all !>

Edit: The background is that I want to remove all HTML tags that users enter in a particular text area, while still allowing them to enter text such as <hi all>. The current approaches remove anything enclosed in <>.
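To make the problem concrete, here is a small sketch that runs approach (2) on the example string. The function name removeTagsWithRegex is only an illustrative rename of remove_tags, and it needs no DOM, so it can be tried outside the browser as well.

```javascript
// Approach (2) from above: remove anything between "<" and ">".
function removeTagsWithRegex(html) {
  return html.replace(/<(?:.|\n)*?>/gm, '');
}

var input = '<div> hello <hi all !> </div>';
var stripped = removeTagsWithRegex(input);

// JSON.stringify makes the leftover whitespace visible.
// The "<hi all !>" part is removed together with the real <div> tags.
console.log(JSON.stringify(stripped));
```

The same loss happens with approaches (1) and (3), because the browser parses <hi all !> as an unknown element and its "text" is empty.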

425

asked Jun 18 '13 08:06

cp100

1 Answer

Using a regex might not be a problem if you consider a different approach. For instance, looking for all tags, and then checking to see if the tag name matches a list of defined, valid HTML tag names:

var protos = document.body.constructor === window.HTMLBodyElement,
    validHTMLTags  =/^(?:a|abbr|acronym|address|applet|area|article|aside|audio|b|base|basefont|bdi|bdo|bgsound|big|blink|blockquote|body|br|button|canvas|caption|center|cite|code|col|colgroup|data|datalist|dd|del|details|dfn|dir|div|dl|dt|em|embed|fieldset|figcaption|figure|font|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hgroup|hr|html|i|iframe|img|input|ins|isindex|kbd|keygen|label|legend|li|link|listing|main|map|mark|marquee|menu|menuitem|meta|meter|nav|nobr|noframes|noscript|object|ol|optgroup|option|output|p|param|plaintext|pre|progress|q|rp|rt|ruby|s|samp|script|section|select|small|source|spacer|span|strike|strong|style|sub|summary|sup|table|tbody|td|textarea|tfoot|th|thead|time|title|tr|track|tt|u|ul|var|video|wbr|xmp)$/i;

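function sanitize(txt) {
    var // Matches a quoted attribute value, e.g. ="a > b".
        // (?!\1) is used because a backreference has no effect inside [^...].
        normaliseQuotes = /=(["'])(?:(?!\1)[\s\S])*\1/g,
        // Escape angle brackets inside quoted values so they can't be mistaken for tags
        normaliseFn = function ($0) {
            return $0.replace(/</g, '&lt;').replace(/>/g, '&gt;');
        },
        replaceInvalid = function ($0, tag, off, txt) {
            var
                // Is it an invalid tag (unknown to the browser, or not in the list)?
                invalidTag = protos &&
                    document.createElement(tag) instanceof HTMLUnknownElement
                    || !validHTMLTags.test(tag),

                // Is the tag complete, i.e. closed by '>' before the next '<'?
                isComplete = txt.slice(off+1).search(/^[^<]+>/) > -1;

            // Escape the '<' of anything that is not a complete, valid tag
            return invalidTag || !isComplete ? '&lt;' + tag : $0;
        };

    txt = txt.replace(normaliseQuotes, normaliseFn)
             .replace(/<(\w+)/g, replaceInvalid);

    // Let the browser parse what is left and read back only the text
    var tmp = document.createElement("DIV");
    tmp.innerHTML = txt;

    // innerText is the fallback for old browsers without textContent
    return "textContent" in tmp ? tmp.textContent : tmp.innerText;
}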

Working Demo: http://jsfiddle.net/m9vZg/3/

This works because browsers parse '>' as text if it isn't part of a matching '<' opening tag. It doesn't suffer the same problems as trying to parse HTML tags using a regular expression, because you're only looking for the opening delimiter and the tag name, everything else is irrelevant.
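The same idea (look only at the opening delimiter and the tag name) can also be sketched without a DOM. The version below is a simplified variant, not the code from the demo: KNOWN_TAGS is a deliberately short, hypothetical allow-list and stripKnownTags is an illustrative name. It removes a tag only when its name is on the list and leaves everything else untouched.

```javascript
// Hypothetical, shortened allow-list; the full list is in validHTMLTags above.
var KNOWN_TAGS = /^(?:a|b|br|div|em|i|li|ol|p|span|strong|ul)$/i;

function stripKnownTags(text) {
  // Capture only the name that follows "<" or "</"; the rest of the
  // tag, up to the next ">", is matched but not inspected.
  return text.replace(/<\/?([a-z][a-z0-9]*)\b[^>]*>/gi, function (match, name) {
    return KNOWN_TAGS.test(name) ? '' : match;
  });
}

var result = stripKnownTags('<div> hello <hi all !> </div>');
console.log(JSON.stringify(result)); // the <div> tags are gone, <hi all !> stays
```

Unlike sanitize, this sketch does not handle a ">" inside a quoted attribute value, and it should not be relied on as a security filter.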

It's also future proof: the WebIDL specification tells vendors how to implement prototypes for HTML elements, so we try to create an HTML element from the current matching tag. If the element is an instance of HTMLUnknownElement, we know that it's not a valid HTML tag. The validHTMLTags regular expression defines a list of HTML tags for older browsers, such as IE 6 and 7, that do not implement these prototypes.
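The detection described above can be isolated into a small helper. This is only a sketch: isKnownTag and the shortened FALLBACK_TAGS list are hypothetical names that do not appear in the demo, and the try/catch is an added assumption to cope with names that createElement rejects.

```javascript
// Hypothetical, shortened fallback list for environments without
// element prototypes (old browsers, or no DOM at all).
var FALLBACK_TAGS = /^(?:a|b|br|div|em|i|p|span|strong)$/i;

function isKnownTag(name) {
  var hasPrototypes = typeof document !== 'undefined' &&
                      typeof HTMLUnknownElement !== 'undefined';

  if (hasPrototypes) {
    try {
      // Elements the browser does not recognise are HTMLUnknownElement instances.
      return !(document.createElement(name) instanceof HTMLUnknownElement);
    } catch (e) {
      // createElement throws for strings that are not valid element names.
      return false;
    }
  }

  // No usable prototypes: fall back to the fixed list.
  return FALLBACK_TAGS.test(name);
}

console.log(isKnownTag('div'), isKnownTag('hi'));
```

With prototype support the browser's own element table decides, so newly standardised tags are picked up without touching the list.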

118

answered Oct 23 '22 19:10

Andy E
