
getElementsByTagName() equivalent for textNodes

Is there any way to get the collection of all textNode objects within a document?

getElementsByTagName() works great for Elements, but textNodes are not Elements.

Update: I realize this can be accomplished by walking the DOM - as many below suggest. I know how to write a DOM-walker function that looks at every node in the document. I was hoping there was some browser-native way to do it. After all it's a little strange that I can get all the <input>s with a single built-in call, but not all textNodes.

asked Apr 05 '10 16:04 by levik



1 Answer

Update:

I have outlined some basic performance tests for each of these 6 methods over 1000 runs. getElementsByTagName is the fastest, but it does a half-baked job: it does not select all elements, only one particular type of tag (p, I think), and it blindly assumes that each element's firstChild is a text node. It may be a little flawed, but it is there for demonstration purposes and for comparing its performance to TreeWalker. Run the tests yourself on jsfiddle to see the results.

  1. Using a TreeWalker
  2. Custom Iterative Traversal
  3. Custom Recursive Traversal
  4. Xpath query
  5. querySelectorAll
  6. getElementsByTagName

Let's assume for a moment that there is a method that lets you get all Text nodes natively. You would still have to traverse each resulting text node and read node.nodeValue to get the actual text, as you would with any DOM node. So the performance cost is not in iterating through the text nodes, but in iterating through all the nodes that are not text and checking their type. I would argue (based on the results) that TreeWalker performs just as fast as getElementsByTagName, if not faster (even with getElementsByTagName playing handicapped).

    Ran each test 1000 times.

    Method                  Total ms        Average ms
    --------------------------------------------------
    document.TreeWalker          301            0.301
    Iterative Traverser          769            0.769
    Recursive Traverser         7352            7.352
    XPath query                 1849            1.849
    querySelectorAll            1725            1.725
    getElementsByTagName         212            0.212
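The jsfiddle harness itself is not reproduced here, so the following is only a sketch of how totals and averages like these could be gathered. `benchmark` is a hypothetical helper, not the code that produced the table, and it measures plain wall-clock time with no warm-up.

```javascript
// Use the high-resolution clock when it exists, otherwise fall back to Date.now().
const now = (typeof performance !== "undefined" && typeof performance.now === "function")
    ? () => performance.now()
    : () => Date.now();

// Hypothetical helper: calls fn `runs` times and reports total and average milliseconds.
function benchmark(fn, runs = 1000) {
    const start = now();
    for (let i = 0; i < runs; i++) {
        fn();
    }
    const totalMs = now() - start;
    return { runs: runs, totalMs: totalMs, averageMs: totalMs / runs };
}
```

In a page where the functions below are defined, `benchmark(nativeTreeWalker)` would return an object with the total and average time for 1000 calls. Absolute numbers depend heavily on the browser and on the size of the document.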

Source for each method:

TreeWalker

    function nativeTreeWalker() {
        var walker = document.createTreeWalker(
            document.body,
            NodeFilter.SHOW_TEXT,
            null,
            false
        );

        var node;
        var textNodes = [];

        while(node = walker.nextNode()) {
            textNodes.push(node.nodeValue);
        }
    }
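Since the question asks for the text node objects themselves, here is a minimal sketch of a variant that returns the nodes rather than their values. The name `getTextNodes` and its `doc` parameter are assumptions made for illustration and are not part of the benchmark; `doc` only exists so the function can be pointed at a specific document.

```javascript
// NodeFilter.SHOW_TEXT has the numeric value 4. It is spelled out here so the
// sketch does not depend on the NodeFilter global being present.
const SHOW_TEXT = 4;

// Hypothetical helper: returns every Text node under `root`, in document order.
function getTextNodes(root, doc = globalThis.document) {
    const walker = doc.createTreeWalker(root, SHOW_TEXT);
    const nodes = [];
    let node;
    while ((node = walker.nextNode())) {
        nodes.push(node); // keep the node itself, not node.nodeValue
    }
    return nodes;
}
```

In a page, `getTextNodes(document.body)` would give an array of Text nodes, and `nodeValue` can then be read (or written) on each one as needed.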

Recursive Tree Traversal

    function customRecursiveTreeWalker() {
        var result = [];

        (function findTextNodes(current) {
            for(var i = 0; i < current.childNodes.length; i++) {
                var child = current.childNodes[i];
                if(child.nodeType == 3) {
                    result.push(child.nodeValue);
                }
                else {
                    findTextNodes(child);
                }
            }
        })(document.body);
    }

Iterative Tree Traversal

    function customIterativeTreeWalker() {
        var result = [];
        var root = document.body;

        var node = root.childNodes[0];
        while(node != null) {
            if(node.nodeType == 3) { /* Fixed a bug here. Thanks @theazureshadow */
                result.push(node.nodeValue);
            }

            if(node.hasChildNodes()) {
                node = node.firstChild;
            }
            else {
                while(node != root && node.nextSibling == null) {
                    node = node.parentNode;
                }
                /* Stop once we are back at the root, so the walk never
                   continues into a sibling of the root (for example a
                   comment placed after the closing body tag). */
                if(node == root) {
                    break;
                }
                node = node.nextSibling;
            }
        }
    }

querySelectorAll

    function nativeSelector() {
        var elements = document.querySelectorAll("body, body *"); /* Fixed a bug here. Thanks @theazureshadow */
        var results = [];
        var child;
        for(var i = 0; i < elements.length; i++) {
            child = elements[i].childNodes[0];
            if(elements[i].hasChildNodes() && child.nodeType == 3) {
                results.push(child.nodeValue);
            }
        }
    }
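Note that the querySelectorAll function above only inspects the first child of each element, so text that follows a child element (for example the " tail" in `<p><b>x</b> tail</p>`) is missed. Below is a sketch of a variant that checks every child. `textNodesOfElements` is a hypothetical name, and it accepts any array-like of elements, such as the result of `querySelectorAll`.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE

// Hypothetical helper: collects all direct Text children of each element.
function textNodesOfElements(elements) {
    const results = [];
    for (let i = 0; i < elements.length; i++) {
        const children = elements[i].childNodes;
        for (let j = 0; j < children.length; j++) {
            if (children[j].nodeType === TEXT_NODE) {
                results.push(children[j]);
            }
        }
    }
    return results;
}
```

Called as `textNodesOfElements(document.querySelectorAll("body, body *"))`, this should visit each text node under body once, since every text node has exactly one parent element. It does more work per element than the benchmarked version, so the timing in the table does not apply to it.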

getElementsByTagName (handicap)

    function getElementsByTagName() {
        var elements = document.getElementsByTagName("p");
        var results = [];
        for(var i = 0; i < elements.length; i++) {
            results.push(elements[i].childNodes[0].nodeValue);
        }
    }

XPath

    function xpathSelector() {
        var xpathResult = document.evaluate(
            "//*/text()",
            document,
            null,
            XPathResult.ORDERED_NODE_ITERATOR_TYPE,
            null
        );

        var results = [], res;
        while(res = xpathResult.iterateNext()) {
            results.push(res.nodeValue);  /* Fixed a bug here. Thanks @theazureshadow */
        }
    }

Also, you might find this discussion helpful - http://bytes.com/topic/javascript/answers/153239-how-do-i-get-elements-text-node

answered Oct 07 '22 17:10 by Anurag