I have an HTML page, and I want only its text (the content of all text nodes). For example, the markup below should give the plain text on the line after it:
<span>hello <strong>sir</strong></span>
hello sir
You could use $('.gettext').text(); in jQuery.
Just call the html2text method, passing it the HTML text, and it will return the plain text.
Assuming you only want the children of the body element...
<html><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<title> Example</title>
</head>
<body>
a <div>b<span>c</span></div>
</body></html>
var body = document.body;
var textContent = body.textContent || body.innerText;
console.log(textContent); // a bc
You need to check for textContent first and fall back to innerText, because our good friend IE (version 8 and earlier) only supports innerText.
It is much easier if you have a library such as jQuery, e.g. $('body').text().
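If you need the individual text nodes rather than one concatenated string, a recursive walk over childNodes is one way to collect them. This is only a minimal sketch: the function name collectTextNodes is made up for illustration, and it relies only on the standard nodeType, nodeValue and childNodes properties.

```javascript
// Hypothetical helper: gathers the value of every text node under `node`.
// It relies only on nodeType, nodeValue and childNodes.
function collectTextNodes(node, out = []) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers
  if (node.nodeType === TEXT_NODE) {
    out.push(node.nodeValue);
  } else if (node.childNodes) {
    // Array.from copes with a live NodeList as well as a plain array
    for (const child of Array.from(node.childNodes)) {
      collectTextNodes(child, out);
    }
  }
  return out;
}

// In a browser you would call it like this:
// var parts = collectTextNodes(document.body);
// console.log(parts.join(''));
```

Like textContent, this also picks up whitespace-only text nodes and the contents of script and style elements, so filter the array if you do not want those.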
It can also be done on the server side, for example with strip_tags() in PHP. However, if you only want the body element, you'd first need to drill down to it using a DOM parser such as DOMDocument.
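To illustrate what tag stripping does without a DOM, here is a rough sketch in JavaScript (for example under Node.js). It is not PHP's strip_tags(): the name stripTags is made up, and the regular expression is a naive assumption that breaks on a ">" inside an attribute value, does not decode entities, and keeps the contents of script and style elements.

```javascript
// Hypothetical, naive tag stripper: removes anything that looks like <...>.
// Assumes no ">" inside attribute values and no entities to decode.
function stripTags(html) {
  return html.replace(/<[^>]*>/g, "");
}

console.log(stripTags("<span>hello <strong>sir</strong></span>")); // hello sir
```

Run on a whole page, this would also return the text of the title element, which is why you need a real parser to isolate the body first.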