 

jQuery parse HTML without loading images

I load HTML from another page to extract and display data from it:

$.get('http://example.org/205.html', function (html) {
    console.log( $(html).find('#c1034') );
});

That works, but because $(html) parses the markup into the current document, my browser tries to load the images referenced in 205.html. Those images do not exist on my domain, so I get a lot of 404 errors.

Is there a way to parse the page like $(html) but without loading the whole page into my browser?

PiTheNumber asked Feb 27 '13 13:02



2 Answers

The jQuery documentation notes that you can pass an "owner document" as the second argument to $().

If you create a separate, virtual document and pass it in, the HTML is parsed into that document instead of the current page, so the browser does not automatically load the images present in the supplied HTML:

var ownerDocument = document.implementation.createHTMLDocument('virtual');
$(html, ownerDocument).find('.some-selector');
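
Combined with the $.get call from the question, the idea might be wrapped up roughly as follows. This is only a sketch: the helper name findInDetachedDocument is hypothetical, and its last two parameters exist purely so the function can be exercised without a browser. In a page with jQuery loaded they can be omitted.

```javascript
// Hypothetical helper: parse `html` into a detached document and return the
// elements matching `selector`. `jq` and `doc` fall back to the page's own
// jQuery and document when they are not supplied.
function findInDetachedDocument(html, selector, jq, doc) {
    jq = jq || window.jQuery;
    doc = doc || window.document;
    // The markup is parsed into this separate document rather than the
    // current page, so <img> tags in it should not be fetched.
    var ownerDocument = doc.implementation.createHTMLDocument('virtual');
    return jq(html, ownerDocument).find(selector);
}

// Usage in the browser, with the URL and id from the question:
// $.get('http://example.org/205.html', function (html) {
//     console.log(findInDetachedDocument(html, '#c1034'));
// });
```

As with the original $(html).find(...) call, .find() only searches descendants of the parsed top-level elements.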
Thomas Brus answered Sep 20 '22 12:09



Use a regex to remove all <img> tags from the HTML string before parsing it:

html = html.replace(/<img[^>]*>/gi, "");
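
A slightly more defensive variant of the same idea could look like the sketch below. The helper name stripImages and the sample markup are made up for illustration. It still assumes that no attribute value inside an <img> tag contains a ">" character, and it leaves other resource-loading markup (CSS backgrounds, <source>, <video poster>) untouched.

```javascript
// Hypothetical helper: drop every <img> tag from an HTML string.
function stripImages(html) {
    // \b keeps the pattern from matching other tag names that merely start
    // with "img"; the i flag also catches upper-case tags such as <IMG>.
    return html.replace(/<img\b[^>]*>/gi, '');
}

var sample = '<div id="c1034"><IMG src="a.png"><p>Hi</p><img src="b.png" /></div>';
console.log(stripImages(sample)); // <div id="c1034"><p>Hi</p></div>
```

The cleaned string can then be passed to $(...) as in the question, for example $(stripImages(html)).find('#c1034').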
Bhuvan Rikka answered Sep 22 '22 12:09
