
How to get a webpage as plain text without any HTML using JavaScript? [duplicate]

I am trying to find a way, using JavaScript or jQuery, to write a function that removes all the HTML tags from a page and gives me just the plain text of that page.

How can this be done? Any ideas?

Amr Elgarhy asked Dec 09 '22 15:12


2 Answers

IE and WebKit:

document.body.innerText

Others:

document.body.textContent

(as suggested by Amr ElGarhy)

Most JavaScript frameworks implement a cross-browser way to do this, usually along these lines:

text = document.body.textContent || document.body.innerText;

In WebKit, textContent seems to keep some of the source formatting (such as whitespace and line breaks from the markup), whereas innerText strips it and returns the text roughly as it is rendered.
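The fallback above can be wrapped in a small helper. This is only a minimal sketch: the name getPlainText is hypothetical (it is not part of any library), and it assumes the element you pass in exposes at least one of textContent or innerText as a string. It checks the property type instead of using ||, so an empty page yields an empty string rather than undefined.

```javascript
// Hypothetical helper (not from any library): returns the text of `root`
// with all markup removed, preferring textContent and falling back to
// innerText for browsers that lack it.
function getPlainText(root) {
  if (typeof root.textContent === "string") {
    return root.textContent;
  }
  if (typeof root.innerText === "string") {
    return root.innerText;
  }
  // Neither property is available, so there is no text to return.
  return "";
}

// In a browser, pass the page body to get the whole page as plain text.
if (typeof document !== "undefined" && document.body) {
  console.log(getPlainText(document.body));
}
```

To prefer the rendered text instead, swap the order of the two checks so that innerText is tried first.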

Jakub Hampl answered Dec 12 '22 03:12


It depends on how much formatting you want to keep, but with jQuery you can do it like this:

jQuery(document.body).text();

Wolph answered Dec 12 '22 03:12