
Serialize an HTMLDocument and then render it on the server?

After some Google searching, I did not find anything that fits my need. I want to save the current web page exactly as it is. Many web pages execute JavaScript and change their CSS, so after some user interaction the page may differ from the one that was first loaded into the browser. I want to send the current state of the page to the server and render it there. Is there a JavaScript library for this task? Thanks!

Yang Bo asked Jan 09 '10 06:01


3 Answers

Even simpler:

var serialized = document.documentElement.innerHTML

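outerHTML instead of innerHTML would be better, since it also includes the <html> element itself, but it didn't work in Firefox when this was written (Firefox added outerHTML in version 11).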

Let's test it.

>>> document.body.style.color = 'red';
>>> document.documentElement.innerHTML
...
<body style="color: red;">
...
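
For browsers where outerHTML is available, here is a minimal sketch of a fuller snapshot that also keeps the doctype and the attributes on <html>. The function names serializeDocument and doctypeToString are made up for illustration; only document.doctype and documentElement.outerHTML are standard DOM.

```javascript
// Rebuild the doctype declaration, which is not part of documentElement's markup.
function doctypeToString(doctype) {
  if (!doctype) return "";
  let s = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) s += ' PUBLIC "' + doctype.publicId + '"';
  if (!doctype.publicId && doctype.systemId) s += " SYSTEM";
  if (doctype.systemId) s += ' "' + doctype.systemId + '"';
  return s + ">\n";
}

// Serialize the live DOM: doctype plus the <html> element with its attributes.
function serializeDocument(doc) {
  return doctypeToString(doc.doctype) + doc.documentElement.outerHTML;
}

// In a browser: var serialized = serializeDocument(document);
```

Passing the document in as a parameter, rather than reading the global, also lets the same function serialize an iframe's document.
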
NVI answered Nov 16 '22 18:11


I'm working on something rather similar and wanted to share a summary of what I'm noticing with innerHTML in IE8, Firefox 3.6, and Chrome 5.0:

IE

  • Strips the quotes from around many attribute values
  • Singleton (void) elements such as <br> aren't self-closed
  • If the values on the elements change after the HTML has been loaded, the new values are picked up

Firefox, Chrome

  • Singleton (void) elements such as <br> aren't self-closed
  • If the values on the elements change after the HTML has been loaded, the new values are NOT picked up. Only the default values set in the HTML at initial rendering appear.
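
One possible workaround for the Firefox/Chrome behavior, assuming "values" here means form field state: copy the live properties into attributes before reading innerHTML. This is only a sketch; syncFormState and toggle are hypothetical helper names, and it has not been checked against IE8.

```javascript
// Copy live form state (DOM properties) into attributes and content,
// so that a later innerHTML read reflects what the user entered.
function syncFormState(root) {
  const fields = root.querySelectorAll("input, textarea, select");
  for (const el of fields) {
    const tag = el.tagName.toLowerCase();
    if (tag === "textarea") {
      el.textContent = el.value; // a textarea's content is its default value
    } else if (tag === "select") {
      for (const opt of el.options) toggle(opt, "selected", opt.selected);
    } else if (el.type === "checkbox" || el.type === "radio") {
      toggle(el, "checked", el.checked);
    } else {
      el.setAttribute("value", el.value);
    }
  }
}

// Add or remove a boolean attribute to match the given state.
function toggle(el, name, on) {
  if (on) el.setAttribute(name, "");
  else el.removeAttribute(name);
}

// In a browser: syncFormState(document); then read innerHTML as before.
```

Note that this writes everything the user typed into the markup, including password fields, so you may want to skip those before sending the result anywhere.
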
Zoey answered Nov 16 '22 19:11


Serializing a complete web page is as simple as:

var serialized = document.body.innerHTML;

If you really need the full document, including the head, then:

var serialized =
    '<head>' +
        document.getElementsByTagName('head')[0].innerHTML +
    '</head><body>' +
        document.body.innerHTML +
    '</body>';

Now all you need to do is submit it via AJAX.
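
As a rough sketch of that submit step, here is one way to do it with fetch instead of XMLHttpRequest. The "/save-page" endpoint and the postSnapshot name are assumptions for illustration; your server has to provide a handler that accepts the raw HTML body.

```javascript
// Send the serialized markup to the server as the request body.
// "/save-page" is a hypothetical endpoint, not something the server gives you for free.
// fetchImpl is injectable so the function can be exercised without a network.
async function postSnapshot(html, url = "/save-page", fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, {
    method: "POST",
    headers: { "Content-Type": "text/html; charset=utf-8" },
    body: html,
  });
  if (!response.ok) {
    throw new Error("Snapshot upload failed with status " + response.status);
  }
  return response;
}

// In a browser: postSnapshot(serialized).catch(console.error);
```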

As for server-side rendering, it depends on what you mean by rendering. I'm currently using wkhtmltopdf to implement a "save as PDF" feature on my site. It uses WebKit to render the HTML before generating the PDF, so it fully supports CSS and JavaScript.

And if you need an image instead of a PDF file, you can always use Ghostscript to print the PDF to a JPG/PNG file.
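
If the server side happens to be Node.js (an assumption, the approach works from any language that can spawn a process), the two steps could be chained roughly like this. The file names and the helper names pdfArgs, pngArgs, run, and renderSnapshot are hypothetical; the command-line forms used are wkhtmltopdf <input> <output> and Ghostscript's -sDEVICE, -r, and -o options.

```javascript
// Argument lists are built separately so they can be inspected without running anything.
function pdfArgs(htmlPath, pdfPath) {
  return [htmlPath, pdfPath]; // wkhtmltopdf <input> <output>
}

function pngArgs(pdfPath, pngPattern, dpi = 150) {
  // png16m = 24-bit color PNG; a %d pattern in the output name gives one file per page
  return ["-sDEVICE=png16m", "-r" + dpi, "-o", pngPattern, pdfPath];
}

// Run an external program and resolve with its stdout.
function run(cmd, args) {
  const { execFile } = require("child_process"); // loaded only when a command is run
  return new Promise((resolve, reject) => {
    execFile(cmd, args, (err, stdout) => (err ? reject(err) : resolve(stdout)));
  });
}

// HTML file -> PDF via wkhtmltopdf, then PDF -> PNG page images via Ghostscript.
async function renderSnapshot(htmlPath, pdfPath, pngPattern) {
  await run("wkhtmltopdf", pdfArgs(htmlPath, pdfPath));
  await run("gs", pngArgs(pdfPath, pngPattern));
}

// Example: renderSnapshot("snapshot.html", "snapshot.pdf", "page-%03d.png");
```

Both programs need to be installed and on the PATH. Using execFile with an argument array avoids passing user-controlled file names through a shell.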

slebetman answered Nov 16 '22 18:11