
Why does the innerText property of the html element only show the innerText of the body element?

console.log(document.getElementsByTagName('html')[0].textContent);
console.log(document.getElementsByTagName('html')[0].innerText);
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Document</title>
</head>
<body>

    <p>innnerHtml of paragraph</p>
</body>
</html>

The textContent property prints all the text content inside the html element, excluding the tags. It also prints all the whitespace and newlines. To get the text without the whitespace and newlines, I used the innerText property, but it didn't print the text inside the title element; it printed only the text inside the p element. Why didn't the innerText property work as I expected?

Hash asked Jan 28 '23 06:01


2 Answers

Your code is behaving as intended; I think you are confusing the two properties. Have a look at the MDN documentation, which lists the differences between them.

A couple of them:

  1. While textContent gets the content of all elements, including <script> and <style> elements, innerText does not, only showing human-readable elements.

  2. innerText is aware of styling and won’t return the text of hidden elements, whereas textContent does.

To collapse the whitespace and newlines, you can use a regex replace.

// collapse each run of whitespace (including newlines) into one space, then trim the ends
console.log(document.getElementsByTagName('html')[0].textContent.replace(/\s+/g, ' ').trim());
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Document</title>
</head>
<body>

    <p>innnerHtml of paragraph</p>
</body>
</html>
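
For reference, here is a browser-free sketch of the same whitespace collapsing as a reusable helper. The function name collapseWhitespace and the sample string are assumptions made for illustration; the string only approximates what textContent returns for the page above.

```javascript
// Hypothetical helper: collapse every run of whitespace (spaces, tabs,
// newlines) into a single space and trim the ends.
function collapseWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Assumed sample, roughly the shape of textContent for the page above.
const sample = '\n\n    \n    Document\n\n\n\n    innnerHtml of paragraph\n\n\n';

console.log(collapseWhitespace(sample)); // "Document innnerHtml of paragraph"
```

In a browser, the same helper could be applied to document.documentElement.textContent.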

Always Sunny answered Jan 29 '23 20:01


According to MDN:

Node.innerText is a property that represents the "rendered" text content of a node and its descendants. As a getter, it approximates the text the user would get if they highlighted the contents of the element with the cursor and then copied to the clipboard.

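The contents of the <title> element aren't rendered as text content (browsers' default styles give head and everything inside it display: none), and they certainly cannot be highlighted or copied to the clipboard. As such, they won't be returned by innerText on the html element.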

Interestingly, document.getElementsByTagName('title')[0].innerText does return the contents of the <title> element. I did a bit of reading on this, and it's explained in the spec:

If this element is not being rendered, or if the user agent is a non-CSS user agent, then return the same value as the textContent IDL attribute on this element.

This step can produce surprising results, as when the innerText attribute is accessed on an element not being rendered, its text contents are returned, but when accessed on an element that is being rendered, all of its children that are not being rendered have their text contents ignored.
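
The rule quoted above can be sketched as a toy model that runs without a browser. This is only an approximation under stated assumptions: nodes are plain objects with a hypothetical rendered flag, the function names modelTextContent and modelInnerText are made up for illustration, and the real algorithm's whitespace and line-break handling is ignored.

```javascript
// Simplified model of the quoted rule; NOT the real innerText algorithm.
// Assumed node shapes: { rendered: boolean, children: [...] } for elements,
// { text: string } for text nodes.

// Model of textContent: concatenate every descendant text node, rendered or not.
function modelTextContent(node) {
  if (typeof node.text === 'string') return node.text;
  return node.children.map(modelTextContent).join('');
}

// Model of innerText: if the element itself is not rendered, fall back to
// textContent; otherwise skip every child element that is not rendered.
function modelInnerText(node) {
  if (typeof node.text === 'string') return node.text;
  if (!node.rendered) return modelTextContent(node);
  return node.children
    .filter((child) => typeof child.text === 'string' || child.rendered)
    .map(modelInnerText)
    .join('');
}

// The page from the question, with head and title marked as not rendered.
const title = { rendered: false, children: [{ text: 'Document' }] };
const head = { rendered: false, children: [title] };
const p = { rendered: true, children: [{ text: 'innnerHtml of paragraph' }] };
const body = { rendered: true, children: [p] };
const html = { rendered: true, children: [head, body] };

console.log(modelTextContent(html)); // "DocumentinnnerHtml of paragraph"
console.log(modelInnerText(html));   // "innnerHtml of paragraph"
console.log(modelInnerText(title));  // "Document"
```

The last two lines mirror the surprise described in the spec: asking the rendered html element skips the title, while asking the unrendered title element directly returns its text.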

Eaten by a Grue answered Jan 29 '23 21:01