Javascript DOM, get node text without losing spacing info

Question

I am using JavaScript and want to traverse the HTML tree, getting all the text as it appears to the user. However, I am losing spacing information.

Let's say I have two docs:

<html>XXX<p>YY   YY</p></html>

<html>XXX<p>YY&nbsp;&nbsp;&nbsp;YY</p></html>

The first one will appear with 1 space between the Ys. The second will have 3 spaces. However, if I traverse the tree and, for each #text node, use:

text = node.nodeValue;

then the text for both nodes will have 3 spaces. I no longer know which one has the "real" nbsp spaces. I can use node.innerHTML for the p elements, which will show the nbsp, but I don't think that I can use innerHTML to get just the XXX text (without some kind of text subtraction).

I could just get the innerHTML of the whole document and parse that. However, I also need the computed style of each element, which I am going to get using:

window.getComputedStyle(theElement).getPropertyValue("text-align");

So, I will be traversing each node anyway. Also, innerHTML shows the source as is, while traversing the nodes "fixes" HTML errors, adding end tags, etc. That's a good thing and something I'd like to keep.

bfavaretto · Accepted Answer

What if you test by charCode? I believe a regular space is 32, while &nbsp; is 160.
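A minimal sketch of that idea, in case it helps. The helper names countSpaces and renderedText are made up for illustration, and the collapsing rule is only an approximation of the default white-space: normal behavior (it ignores trimming at line edges and any white-space: pre styling). The input is assumed to be the string you get from node.nodeValue.

```javascript
// Character codes: 32 is a regular space, 160 is a non-breaking space (&nbsp;).
const SPACE = 32;
const NBSP = 160;

// Hypothetical helper: counts regular and non-breaking spaces in a string,
// e.g. the nodeValue of a #text node.
function countSpaces(nodeValue) {
  let regular = 0;
  let nonBreaking = 0;
  for (let i = 0; i < nodeValue.length; i++) {
    const code = nodeValue.charCodeAt(i);
    if (code === SPACE) regular++;
    else if (code === NBSP) nonBreaking++;
  }
  return { regular, nonBreaking };
}

// Hypothetical helper: approximates what the user sees by collapsing each run
// of regular whitespace (space, tab, line feed, carriage return) into a single
// space, while keeping every non-breaking space untouched.
function renderedText(nodeValue) {
  let out = "";
  let prevCollapsible = false;
  for (let i = 0; i < nodeValue.length; i++) {
    const code = nodeValue.charCodeAt(i);
    const collapsible =
      code === SPACE || code === 9 || code === 10 || code === 13;
    if (collapsible) {
      // Emit one space for the first character of the run, skip the rest.
      if (!prevCollapsible) out += " ";
    } else {
      out += nodeValue[i];
    }
    prevCollapsible = collapsible;
  }
  return out;
}
```

With the two documents from the question, countSpaces(node.nodeValue) on the text inside the p element should report three regular spaces for the first and three non-breaking spaces for the second, and renderedText should collapse only the first one down to a single space.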

Tags:

javascript

dom

user984003
