
Parse an HTML string with JS without triggering any page loads?

As this answer indicates, a good way to parse HTML in JavaScript is to simply re-use the browser's HTML-parsing capabilities like so:

var el = document.createElement( 'html' );
el.innerHTML = "<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>";
// process 'el' as desired

However, for certain HTML strings this makes the browser request the resources they reference, for example:

var foo = document.createElement('div');
foo.innerHTML = '<img src="http://example.com/img.png">';

As soon as this example is run, the browser attempts to load the image.

How might I process HTML from JavaScript without this behavior?

asked Feb 28 '26 19:02 by Claudiu


2 Answers

If you want to parse an HTML string without loading the resources it references, such as images or scripts, use DOMImplementation's createHTMLDocument() to create a new document that is not connected to the one the browser is currently displaying. It otherwise behaves like a normal document, so you can assign markup to it and query it as usual.
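A minimal sketch of this approach, assuming a browser environment. The helper name `parseDetached` and its `impl` parameter are made up for illustration; only `createHTMLDocument()` is a standard DOM method. The browser-only usage is guarded so the snippet does nothing harmful where `document` is missing.

```javascript
// Parse an HTML string into a document that is detached from the current page.
// `impl` is a DOMImplementation; in a browser, pass `document.implementation`.
// (The function name and parameter are assumptions made for this sketch.)
function parseDetached(html, impl) {
  // The new document is not attached to any window, so the browser is not
  // expected to fetch images or run scripts found in the markup.
  var detached = impl.createHTMLDocument('');
  detached.documentElement.innerHTML = html;
  return detached;
}

// Browser-only usage.
if (typeof document !== 'undefined') {
  var parsed = parseDetached(
    "<html><head><title>titleTest</title></head><body>" +
      "<a href='test0'>test01</a><img src='http://example.com/img.png'>" +
      "</body></html>",
    document.implementation
  );
  // Query the detached document like any other document.
  console.log(parsed.title);
  console.log(parsed.querySelectorAll('a').length);
}
```

Passing the DOMImplementation in as an argument is only a convenience for this sketch; calling `document.implementation.createHTMLDocument('')` directly works just as well in a page.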

answered Mar 03 '26 07:03 by Waqas Amjad


I don't know if there is a perfect solution for this, but since the HTML is only being processed, you can rename every src attribute before assigning innerHTML, for example to notSrc="xyz.com". That way the resources won't be loaded, and if you need the original values later in processing you can read them from the renamed attribute. The browser mainly loads images, scripts, and CSS files; renaming src takes care of the first two, and CSS can be handled the same way by renaming the href attribute.
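A rough sketch of the renaming step, assuming a simple regular expression is good enough for the markup at hand. The function name `neutralizeSrc` and the replacement attribute `data-not-src` are made up for illustration. A regex is not an HTML parser, so this can also rewrite a literal " src=" that appears in text content, and it does not touch other attributes that trigger loads, such as srcset.

```javascript
// Rename every `src` attribute so the browser has nothing to fetch.
// (Hypothetical helper; the attribute name `data-not-src` is an arbitrary choice.)
function neutralizeSrc(html) {
  // Require whitespace directly before `src` so that attributes such as
  // `data-src` are left alone; `srcset` is not matched either.
  return html.replace(/(\s)src(\s*=)/gi, '$1data-not-src$2');
}

var raw = '<img src="http://example.com/img.png"><script src="a.js"></script>';
var safe = neutralizeSrc(raw);

// Browser-only usage: the original URL stays available under the new name.
if (typeof document !== 'undefined') {
  var holder = document.createElement('div');
  holder.innerHTML = safe;
  console.log(holder.querySelector('img').getAttribute('data-not-src'));
}
```

The same pattern could be adapted to href for stylesheets, though renaming href everywhere would also affect ordinary links.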

answered Mar 03 '26 07:03 by MoustafaS


