
Restore exact innerHTML to DOM

I'd like to save the html string of the DOM, and later restore it to be exactly the same. The code looks something like this:

var stringified = document.documentElement.innerHTML
// later, after serializing and deserializing
document.documentElement.innerHTML = stringified

This works when the markup is valid, but when the DOM is not W3C-compliant, there's a problem. The first line works fine: stringified matches the DOM exactly. But when I restore from the (non-compliant) stringified, the browser's parser corrects the markup and the resulting DOM is not the same as it was originally.

For example, if my original DOM looks like

<p><div></div></p>

then the final DOM will look like

<p></p><div></div><p></p>

since div elements are not allowed to be inside p elements. Is there some way I can get the browser to use the same html parsing that it does on page load and accept broken html as-is?

Why is the html broken in the first place? The DOM is not controlled by me.

Here's a jsfiddle to show the behavior http://jsfiddle.net/b2x7rnfm/5/. Open your console.

<body>
    <div id="asdf"><p id="outer"></p></div>
    <script type="text/javascript">
        var insert = document.createElement('div');
        var text = document.createTextNode('ladygaga');
        insert.appendChild(text);
        document.getElementById('outer').appendChild(insert);
        var e = document.getElementById('asdf')
        console.log(e.innerHTML);
        e.innerHTML = e.innerHTML;
        console.log(e.innerHTML); // This is different than 2 lines above!!
    </script>
</body>
Sergiu Toarca, asked Jun 26 '15 16:06



2 Answers

If you need to be able to save and restore an invalid HTML structure, you could do it by way of XML. The code which follows comes from this fiddle.

To save, you create a new XML document to which you add the nodes you want to serialize:

var asdf = document.getElementById("asdf");
var outer = document.getElementById("outer");
var add = document.getElementById("add");
var save = document.getElementById("save");
var restore = document.getElementById("restore");

var saved = undefined;
save.addEventListener("click", function () {
  if (saved !== undefined)
    return; // Do not overwrite previously saved data.

  // Create a fake document with a single top-level element, as 
  // required by XML.    
  var parser = new DOMParser();
  var doc = parser.parseFromString("<top/>", "text/xml");

  // We could skip the cloning and just move the nodes to the XML
  // document. This would have the effect of saving and removing 
  // at the same time but I wanted to show what saving while 
  // preserving the data would look like    
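  var clone = asdf.cloneNode(true);
  var top = doc.firstChild;
  // Move the clone's children, so the live document stays untouched
  // until the explicit removal below.
  var child = clone.firstChild;
  while (child) {
    top.appendChild(child);
    child = clone.firstChild;
  }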
  saved = top.innerHTML;
  console.log("saved as: ", saved);

  // Perform the removal here.
  asdf.innerHTML = "";
});

To restore, you create an XML document to deserialize what you saved and then add the nodes to your document:

restore.addEventListener("click", function () {
  if (saved === undefined)
      return; // Don't restore undefined data!

  // We parse the XML we saved.
  var parser = new DOMParser();
  var doc = parser.parseFromString("<top>" + saved + "</top>", "text/xml");
  var top = doc.firstChild;

  var child = top.firstChild;
  while (child) {
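    asdf.appendChild(child);
    // Remove the extra junk added by the XML parser. Only elements
    // have attributes; text and comment nodes are moved as-is.
    if (child.nodeType === Node.ELEMENT_NODE)
      child.removeAttribute("xmlns");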
    child = top.firstChild;
  }
  saved = undefined;
  console.log("inner html after restore", asdf.innerHTML);
});

Using the fiddle, you can:

  1. Press the "Add LadyGaga..." button to create the invalid HTML.

  2. Press "Save and Remove from Document" to save the structure in asdf and clear its contents. This prints to the console what was saved.

  3. Press "Restore" to restore the structure that was saved.

The code above aims to be general. It could be simplified if some assumptions can be made about the HTML structure to be saved. For instance, a fragment with more than one node at its top level is not a well-formed XML document, because XML requires a single top element, so the code above takes pains to add a top-level element (top) to prevent this problem. It is also generally not possible to parse an HTML serialization as XML, so the save operation serializes to XML.
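
The same save and restore steps can also be sketched with XMLSerializer doing the serialization explicitly. This is a minimal sketch under assumptions, not the fiddle's code: saveChildrenAsXml and restoreChildrenFromXml are hypothetical helper names, the serializer and parser are passed in as parameters, and the usage at the bottom assumes an element with id "asdf" as in the question.

```javascript
// Hypothetical helpers; a sketch of the XML round trip described above.

// Serialize every child of `container` as XML and join the pieces.
function saveChildrenAsXml(container, serializer) {
  let xml = "";
  for (const child of Array.from(container.childNodes)) {
    xml += serializer.serializeToString(child);
  }
  return xml;
}

// Parse the saved string inside a single wrapper element (XML needs one
// top-level element), then move the parsed nodes into `container`.
function restoreChildrenFromXml(container, xml, parser) {
  const doc = parser.parseFromString("<top>" + xml + "</top>", "text/xml");
  if (doc.getElementsByTagName("parsererror").length > 0) {
    throw new Error("saved string is not well-formed XML");
  }
  const top = doc.documentElement;
  while (top.firstChild) {
    const node = top.firstChild;
    container.appendChild(node); // moves the node out of `top`
    if (node.nodeType === 1) {
      // Element node: drop the namespace declaration left by the XML parser.
      node.removeAttribute("xmlns");
    }
  }
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== "undefined") {
  const asdf = document.getElementById("asdf");
  if (asdf) {
    const saved = saveChildrenAsXml(asdf, new XMLSerializer());
    asdf.textContent = "";
    restoreChildrenFromXml(asdf, saved, new DOMParser());
  }
}
```

Passing the serializer and parser in keeps the helpers free of globals. The parsererror check is there because DOMParser reports malformed XML through the returned document rather than by throwing.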

This is a proof-of-concept more than anything. There could be side-effects from moving nodes created in an HTML document to an XML document or the other way around that I have not anticipated. I've run the code above on Chrome and Firefox. I don't have IE at hand to run it there.

Louis, answered Oct 07 '22 05:10


This won't work for your most recent clarification, that you must have a string copy. Leaving it, though, for others who may have more flexibility.


Since using the DOM seems to allow you to preserve, to some degree, the invalid structure, and using innerHTML involves reparsing with (as you've observed) side-effects, we have to look at not using innerHTML:

You can clone the original, and then swap in the clone:

var e = document.getElementById('asdf')
snippet.log("1: " + e.innerHTML);
var clone = e.cloneNode(true);
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
document.getElementById('outer').appendChild(insert);
snippet.log("2: " + e.innerHTML);
e.parentNode.replaceChild(clone, e);
e = clone;
snippet.log("3: " + e.innerHTML);

Live Example (the script above, run against this markup):

<div id="asdf">
  <p id="outer">
    <div>ladygaga</div>
  </p>
</div>

<!-- Script provides the `snippet` object, see http://meta.stackexchange.com/a/242144/134069 -->
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>

Note that just like the innerHTML solution, this will wipe out event handlers on the elements in question. You could preserve handlers on the outermost element by creating a document fragment and cloning its children into it, but that would still lose handlers on the children.
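
The document fragment variant mentioned above might look like the following. It is a minimal sketch: snapshotChildren and restoreChildren are hypothetical names, and the usage at the bottom assumes the element with id "asdf" from the question. Because the outermost element itself is never replaced, handlers attached to it stay in place, while handlers on its descendants are still lost.

```javascript
// Hypothetical helpers; a sketch of the fragment-based snapshot.

// Copy the children of `el` into a fresh DocumentFragment.
function snapshotChildren(el) {
  const fragment = el.ownerDocument.createDocumentFragment();
  for (const child of Array.from(el.childNodes)) {
    fragment.appendChild(child.cloneNode(true));
  }
  return fragment;
}

// Replace the current children of `el` with a copy of the snapshot.
function restoreChildren(el, snapshot) {
  while (el.firstChild) {
    el.removeChild(el.firstChild);
  }
  // Append a clone so the same snapshot can be restored again later.
  el.appendChild(snapshot.cloneNode(true));
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== "undefined") {
  const e = document.getElementById("asdf");
  if (e) {
    const snapshot = snapshotChildren(e);
    e.appendChild(document.createElement("div")); // some later change
    restoreChildren(e, snapshot);
  }
}
```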


This earlier solution won't apply to you, but may apply to others in the future:

My earlier solution was to track what you changed, and undo the changes one-by-one. So in your example, that means removing the insert element:

var e = document.getElementById('asdf')
console.log("1: " + e.innerHTML);
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
var outer = document.getElementById('outer');
outer.appendChild(insert);
console.log("2: " + e.innerHTML);
outer.removeChild(insert);
console.log("3: " + e.innerHTML);
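
The track-and-undo idea can be generalized a little. The following is a minimal sketch, assuming every change goes through a small wrapper that records how to reverse it; createUndoLog and its methods are hypothetical names, and appendChild here is only meant for newly created nodes (moving an existing node would need its old position recorded as well).

```javascript
// Hypothetical helper; a sketch of recording inverse operations.
function createUndoLog() {
  const undoSteps = [];
  return {
    // Append a newly created node and remember how to remove it.
    appendChild(parent, child) {
      parent.appendChild(child);
      undoSteps.push(() => parent.removeChild(child));
      return child;
    },
    // Set an attribute and remember its previous state.
    setAttribute(el, name, value) {
      const had = el.hasAttribute(name);
      const old = el.getAttribute(name);
      el.setAttribute(name, value);
      undoSteps.push(() => {
        if (had) el.setAttribute(name, old);
        else el.removeAttribute(name);
      });
    },
    // Undo all recorded changes, most recent first.
    rollback() {
      while (undoSteps.length > 0) {
        undoSteps.pop()();
      }
    }
  };
}

// Browser-only usage; skipped where no DOM is available.
if (typeof document !== "undefined") {
  const outer = document.getElementById("outer");
  if (outer) {
    const log = createUndoLog();
    const insert = document.createElement("div");
    insert.textContent = "ladygaga";
    log.appendChild(outer, insert);
    log.rollback(); // outer is back to its earlier state
  }
}
```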

Live Example (the same script with snippet.log in place of console.log, run against this markup):

<div id="asdf">
  <p id="outer">
    <div>ladygaga</div>
  </p>
</div>

<!-- Script provides the `snippet` object, see http://meta.stackexchange.com/a/242144/134069 -->
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>
T.J. Crowder, answered Oct 07 '22 06:10