Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to normalize HTML in JavaScript or jQuery?

People also ask

What is normalize method in JavaScript?

normalize() is an inbuilt method in javascript which is used to return a Unicode normalisation form of a given input string.

What is Normalisation in HTML?

To improve efficiency, an application will usually normalize text before performing searches or comparisons. Normalization, in this case, means converting the text to use all precomposed or all decomposed characters. There are four normalization forms specified by the Unicode Standard: NFC, NFD, NFKC and NFKD.


JavaScript doesn't actually see a web page in the form of text-based HTML, but rather as a tree structure known as the DOM, or Document Object Model. The order of HTML element attributes in the DOM is not defined (in fact, as Svend comments, they're not even part of the DOM), so the idea of sorting them at the point where JavaScript runs is irrelevant.

I can only guess what you're trying to achieve. If you're trying to do this to improve JavaScript/page performance, most HTML document renderers already presumably put a lot of effort into optimising attribute access, so there's little to be gained there.

If you're trying to order attributes to make gzip compression of pages more effective as they're sent over the wire, understand that JavaScript runs after that point in time. Instead, you may want to look at things that run server-side instead, though it's probably more trouble than it's worth.


Take the HTML and parse into a DOM structure. Then take the DOM structure, and write it back out to HTML. While writing, sort the attributes using any stable sort. Your HTML will now be normalized with regard to attributes.

This is a general way to normalize things. (parse non-normalized data, then write it back out in normalized form).

I'm not sure why you'd want to Normalize HTML, but there you have it. Data is data. ;-)


This is a proof of concept, it can certainly be optimized:

function sort_attributes(a, b) {
  if( a.name == b.name) {
    return 0;
  }

  return (a.name < b.name) ? -1 : 1;
 }

$("#original").find('*').each(function() {
  if (this.attributes.length > 1) {
    var attributes = this.attributes;
    var list = [];

    for(var i =0; i < attributes.length; i++) {
      list.push(attributes[i]);
    }

     list.sort(sort_attributes);

    for(var i = 0; i < list.length; i++) {
      this.removeAttribute(list[i].name, list[i].value);
    }

     for(var i = 0; i < list.length; i++) {
       this.setAttribute(list[i].name, list[i].value);
    }
  }
 });

Same thing for the second element of the diff, $('#different'). Now $('#original').html() and $('#different').html() show HTML code with attributes in the same order.


you can try open HTML tab in firebug, the attributes are always in same order