I have a JavaScript variable containing the HTML source code of a page (not the source of the current page), and I need to extract all links from it. Any clues as to the best way of doing this?
Is it possible to create a DOM for the HTML in the variable and then walk that?
I don't know if this is the recommended way, but it works (plain JavaScript, no library needed):
var rawHTML = '<html><body><a href="foo">bar</a><a href="narf">zort</a></body></html>';
var doc = document.createElement("html");
doc.innerHTML = rawHTML;
var links = doc.getElementsByTagName("a");
var urls = [];
for (var i = 0; i < links.length; i++) {
    urls.push(links[i].getAttribute("href"));
}
alert(urls);
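Another option in current browsers is DOMParser, which builds a separate, detached document from the string instead of reusing an element of the current page. A minimal sketch, not necessarily the recommended way either: `extractHrefs` is a made-up helper name, and the optional `parser` argument exists only so a different parser object can be passed in outside a browser.

```javascript
// Sketch: parse an HTML string into a detached document and collect hrefs.
// `extractHrefs` is a hypothetical helper; `parser` defaults to the
// browser's built-in DOMParser when no parser is supplied.
function extractHrefs(rawHTML, parser) {
  parser = parser || new DOMParser();
  var doc = parser.parseFromString(rawHTML, "text/html");
  var links = doc.querySelectorAll("a[href]"); // only anchors that have an href
  var urls = [];
  for (var i = 0; i < links.length; i++) {
    urls.push(links[i].getAttribute("href"));
  }
  return urls;
}
```

In a browser, calling extractHrefs(rawHTML) with the sample string above should give the same two values as the loop, "foo" and "narf".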
If you're using jQuery, I believe you can do this quite easily:
// Wrap the markup in a container so top-level <a> elements are found too
var doc = $("<div>").html(rawHTML);
var links = $("a", doc);
http://docs.jquery.com/Core/jQuery#htmlownerDocument
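One thing to keep in mind: getAttribute("href") returns the attribute exactly as written in the markup, so relative links such as "foo" stay relative. If you need absolute URLs, something like the following might work, assuming you know the address the HTML was originally fetched from. `resolveLinks` is a made-up helper name and the base URL is a placeholder; the standard URL constructor does the actual resolving.

```javascript
// Sketch: turn raw href values into absolute URLs.
// `resolveLinks` and the example base URL are illustrative assumptions.
function resolveLinks(hrefs, baseUrl) {
  var resolved = [];
  for (var i = 0; i < hrefs.length; i++) {
    if (hrefs[i] == null) continue; // anchor without an href attribute
    try {
      resolved.push(new URL(hrefs[i], baseUrl).href);
    } catch (e) {
      // skip values that cannot be parsed as a URL
    }
  }
  return resolved;
}

var absolute = resolveLinks(
  ["foo", "/narf", null, "http://other.example/x"],
  "http://example.com/dir/page.html"
);
// absolute is now:
// ["http://example.com/dir/foo", "http://example.com/narf", "http://other.example/x"]
```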