Python has a library called Beautiful Soup that you can use to parse an HTML tree without making GET requests to external web pages. I'm looking for the same thing in JavaScript, but I've only found jsdom and JSSoup (which seems to be little used), and if I'm correct, they only work by making requests.
I want a JavaScript library that lets me parse an entire HTML tree without running into CORS policy errors; that is, it should just parse HTML I already have, without making a request.
How can I do this?
In a browser context, you can use DOMParser:
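// Parse an HTML string into a detached Document; no network request is made.
const html = "<h1>title</h1>";
const parser = new DOMParser();
const parsed = parser.parseFromString(html, "text/html");
// parsed.firstChild is the generated <html> element, so select the <h1> directly.
console.log(parsed.querySelector("h1").textContent); // "title"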
In Node.js, you can use the node-html-parser package:
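// Install first with: npm install node-html-parser
import { parse } from 'node-html-parser';
const html = "<h1>title</h1>";
// parse() works on the string alone and returns the root of the parsed tree.
const parsed = parse(html);
console.log(parsed.firstChild.innerText); // "title"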
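If you want something closer to Beautiful Soup's find_all, here is a minimal sketch of pulling every link out of a tree you have already parsed. The helper name extractLinks and the sample HTML are made up for illustration. It assumes the parsed root exposes querySelectorAll and that each element exposes getAttribute and textContent; that holds for the Document returned by DOMParser, and I'm assuming it also holds for the root returned by node-html-parser, so check that package's documentation before relying on it.

```javascript
// Hypothetical helper: collect every link from an already-parsed tree.
// `root` is assumed to expose querySelectorAll, and each element is assumed
// to expose getAttribute and textContent (true of DOM documents; assumed for
// node-html-parser roots as well).
function extractLinks(root) {
  return Array.from(root.querySelectorAll("a"))
    // Keep only anchors that actually carry an href attribute.
    .filter((a) => a.getAttribute("href") != null)
    .map((a) => ({
      text: a.textContent.trim(),
      href: a.getAttribute("href"),
    }));
}

// Browser-only usage, guarded so the snippet also loads where DOMParser is absent.
if (typeof DOMParser !== "undefined") {
  const doc = new DOMParser().parseFromString(
    '<p><a href="/docs">Docs</a> and <a>no link</a></p>',
    "text/html"
  );
  console.log(extractLinks(doc));
}
```

Nothing here touches the network: the helper only walks a tree built from a string, so CORS never comes into play. In Node.js you would pass it the result of parse(html) instead of a DOMParser document.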