How to parse HTML from JavaScript in Firefox?

What is the best way to parse (get a DOM tree of) an HTML result of an XMLHttpRequest in Firefox?

EDIT:

I do not have the DOM tree, I want to acquire it.

XMLHttpRequest's "responseXML" works only when the result is actual XML, so I have only responseText to work with.

At first the innerHTML hack didn't seem to work with a complete HTML document (one wrapped in <html></html>), but it turns out it works fine.

hmp asked May 20 '09 16:05



1 Answer

innerHTML should work just fine, e.g.

// This would run after the Ajax request has completed:
var myHTML = XHR.responseText;
var tempDiv = document.createElement('div');
// Strip <script> blocks (case-insensitive, spanning newlines) before parsing:
tempDiv.innerHTML = myHTML.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');

// tempDiv now has a DOM structure:
tempDiv.childNodes;
tempDiv.getElementsByTagName('a'); // etc. etc.
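
As an alternative to the innerHTML approach above, current browsers (including recent Firefox versions, though not the ones around when this was asked) let DOMParser parse "text/html" directly. Here is a minimal sketch, not a definitive implementation. The helper names `stripScripts` and `parseHtmlString` are made up for illustration, and `parseHtmlString` assumes it runs in a browser where DOMParser exists.

```javascript
// Hypothetical helpers for illustration; the names are not from any library.

// Remove <script>...</script> blocks from an HTML string.
// The pattern is case-insensitive and matches across newlines.
function stripScripts(html) {
  return html.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');
}

// Parse an HTML string (e.g. xhr.responseText) into a Document.
// Assumes a browser environment that provides DOMParser with
// "text/html" support; throws otherwise.
function parseHtmlString(html) {
  if (typeof DOMParser === 'undefined') {
    throw new Error('DOMParser is not available in this environment');
  }
  return new DOMParser().parseFromString(html, 'text/html');
}
```

In a browser you could then call something like `parseHtmlString(XHR.responseText).getElementsByTagName('a')`. One difference from the div trick: assigning a whole document to a div's innerHTML drops the html, head and body tags and keeps only their contents, whereas the Document returned by DOMParser keeps its own head and body. Scripts in a document parsed this way are not executed, so `stripScripts` is mostly useful if you stay with innerHTML.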
James answered Sep 28 '22 06:09