Cleaning HTML using JavaScript

In an application I am developing, the user enters HTML in a text box to edit an element on their page. At this stage the user can add any sort of content, including broken HTML and bare text nodes.

To make sure I get somewhat clean code, I do this:

var s = document.createElement('div');
s.innerHTML = content;
// Loop over each child node of s; if a text node is found, wrap it in a span.
content = s.innerHTML;

The problem with this snippet is that if the content is <TD>Text</TD>, the result I get is just Text, since a TD cannot be a child of a DIV.

Is there a fix that gives valid content in all cases?

Amit asked Nov 13 '22 12:11


1 Answer

The problem with doing this through the DOM is that you don't really want fully corrected HTML, because you also allow the input to be a snippet. You want some malformed HTML corrected and some left alone.
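To make the snippet issue concrete, here is a minimal sketch of one way to stay with the DOM: pick a parent context from the snippet's first tag, so a TD is parsed inside a table row rather than a DIV. The names `WRAPPERS`, `wrapperFor` and `cleanSnippet` are hypothetical, the wrapper table is an assumption that covers only a few table and select elements, and text or elements that follow the first tag may still be moved around by the parser.

```javascript
// Hypothetical lookup: which markup a snippet must be wrapped in so the
// browser's parser keeps its first tag. "target" is the wrapper element
// whose innerHTML is read back afterwards.
var WRAPPERS = {
  td:     { open: '<table><tbody><tr>', close: '</tr></tbody></table>', target: 'tr' },
  th:     { open: '<table><tbody><tr>', close: '</tr></tbody></table>', target: 'tr' },
  tr:     { open: '<table><tbody>', close: '</tbody></table>', target: 'tbody' },
  tbody:  { open: '<table>', close: '</table>', target: 'table' },
  thead:  { open: '<table>', close: '</table>', target: 'table' },
  tfoot:  { open: '<table>', close: '</table>', target: 'table' },
  option: { open: '<select multiple>', close: '</select>', target: 'select' }
};

// Used when the snippet needs no special parent (or starts with text).
var NO_WRAPPER = { open: '', close: '', target: null };

// Pure string logic: look at the first tag name of the snippet, if any.
function wrapperFor(html) {
  var match = /^\s*<([a-z][a-z0-9]*)/i.exec(html);
  var tag = match ? match[1].toLowerCase() : '';
  return Object.prototype.hasOwnProperty.call(WRAPPERS, tag)
    ? WRAPPERS[tag]
    : NO_WRAPPER;
}

// DOM part: parse the snippet inside a suitable parent and serialize it again.
// "doc" is expected to be a browser Document, e.g. window.document.
function cleanSnippet(html, doc) {
  var wrap = wrapperFor(html);
  var holder = doc.createElement('div');
  holder.innerHTML = wrap.open + html + wrap.close;
  var parent = wrap.target ? holder.querySelector(wrap.target) : holder;
  return parent.innerHTML;
}
```

In a browser this would be called as `cleanSnippet(content, document)`. The text-node wrapping from the question could then run on `parent` before its `innerHTML` is read back.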

A quick search turned up this jQuery plugin, though I can't vouch for it: http://www.davidpirek.com/blog/html-beautifier-jquery-plugin

I would probably agree with Graham and suggest HTML Tidy, since it's mature and fast, even if you have to wait for the response.

Tom Elmore answered Nov 16 '22 04:11