Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get the entire document HTML as a string?

People also ask

How do I get the HTML code for a document?

On Firefox, you can also use the keyboard shortcut Command-U to view the source code of a webpage. On Chrome, the process is very similar. Navigate to the top menu item “View” and click on “Developer/View Source.” You can also use the keyboard shortcut Option-Command-U .

How do you do string in HTML?

Usage. You can create a String instance using plain text or HTML, for example both of these statements are valid: // From plain text var helloWorld = new HTMLString. String('Hello World'); // From HTML var helloWorldBold = new HTMLString.

How do I get all HTML pages?

Fire up Chrome and jump to the webpage you want to view the HTML source code. Right-click the page and click on “View Page Source,” or press Ctrl + U, to see the page's source in a new tab. A new tab opens along with all the HTML for the webpage, completely expanded and unformatted.


MS added the outerHTML and innerHTML properties some time ago.

According to MDN, outerHTML is supported in Firefox 11, Chrome 0.2, Internet Explorer 4.0, Opera 7, Safari 1.3, Android, Firefox Mobile 11, IE Mobile, Opera Mobile, and Safari Mobile. outerHTML is in the DOM Parsing and Serialization specification.

See quirksmode for browser compatibility for what will work for you. All support innerHTML.

var markup = document.documentElement.innerHTML;
alert(markup);

You can do

new XMLSerializer().serializeToString(document)

in browsers newer than IE 9

See https://caniuse.com/#feat=xml-serializer


I believe document.documentElement.outerHTML should return that for you.

According to MDN, outerHTML is supported in Firefox 11, Chrome 0.2, Internet Explorer 4.0, Opera 7, Safari 1.3, Android, Firefox Mobile 11, IE Mobile, Opera Mobile, and Safari Mobile. outerHTML is in the DOM Parsing and Serialization specification.

The MSDN page on the outerHTML property notes that it is supported in IE 5+. Colin's answer links to the W3C quirksmode page, which offers a good comparison of cross-browser compatibility (for other DOM features too).


I tried the various answers to see what is returned. I'm using the latest version of Chrome.

The suggestion document.documentElement.innerHTML; returned <head> ... </body>

Gaby's suggestion document.getElementsByTagName('html')[0].innerHTML; returned the same.

The suggestion document.documentElement.outerHTML; returned <html><head> ... </body></html> which is everything apart from the 'doctype'.

You can retrieve the doctype object with document.doctype; This returns an object, not a string, so if you need to extract the details as strings for all doctypes up to and including HTML5 it is described here: Get DocType of an HTML as string with Javascript

I only wanted HTML5, so the following was enough for me to create the whole document:

alert('<!DOCTYPE HTML>' + '\n' + document.documentElement.outerHTML);


You can also do:

document.getElementsByTagName('html')[0].innerHTML

You will not get the Doctype or html tag, but everything else...


document.documentElement.outerHTML

PROBABLY ONLY IE:

>     webBrowser1.DocumentText

for FF up from 1.0:

//serialize current DOM-Tree incl. changes/edits to ss-variable
var ns = new XMLSerializer();
var ss= ns.serializeToString(document);
alert(ss.substr(0,300));

may work in FF. (Shows up the VERY FIRST 300 characters from the VERY beginning of source-text, mostly doctype-defs.)

BUT be aware, that the normal "Save As"-Dialog of FF MIGHT NOT save the current state of the page, rather the originallly loaded X/h/tml-source-text !! (a POST-up of ss to some temp-file and redirect to that might deliver a saveable source-text WITH the changes/edits prior made to it.)

Although FF surprises by good recovery on "back" and a NICE inclusion of states/values on "Save (as) ..." for input-like FIELDS, textarea etc. , not on elements in contenteditable/ designMode...

If NOT a xhtml- resp. xml-file (mime-type, NOT just filename-extension!), one may use document.open/write/close to SET the appr. content to the source-layer, that will be saved on user's save-dialog from the File/Save menue of FF. see: http://www.w3.org/MarkUp/2004/xhtml-faq#docwrite resp.

https://developer.mozilla.org/en-US/docs/Web/API/document.write

Neutral to questions of X(ht)ML, try a "view-source:http://..." as the value of the src-attrib of an (script-made!?) iframe, - to access an iframes-document in FF:

<iframe-elementnode>.contentDocument, see google "mdn contentDocument" for appr. members, like 'textContent' for instance. 'Got that years ago and no like to crawl for it. If still of urgent need, mention this, that I got to dive in ...