
Removing DOM nodes when traversing a NodeList

Tags: java, dom, xml

I'm about to delete certain elements in an XML document, using code like the following:

NodeList nodes = ...;
for (int i = 0; i < nodes.getLength(); i++) {
  Element e = (Element)nodes.item(i);
  if (certain criteria involving Element e) {
    e.getParentNode().removeChild(e);
  }
}

Will this interfere with proper traversal of the NodeList? Any other caveats with this approach? If this is totally wrong, what's the proper way to do it?

skiphoppy asked Sep 03 '09 15:09



2 Answers

Removing nodes while looping forward will cause undesirable results, most commonly skipped nodes. A DOM NodeList is live: each removal shrinks the list and shifts the following nodes down by one index, so the node right after a removed one is never examined. This has nothing to do with synchronization or thread safety; the list is modified by the loop itself. Most of Java's Iterators throw a ConcurrentModificationException in such a case, but NodeList has no such safeguard.
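To make the skipping concrete, here is a minimal sketch that tries to remove every element with a forward loop. The class name, the helper methods, and the four-element sample document are all made up for illustration; it assumes the list comes from getElementsByTagName, which returns a live NodeList.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ForwardRemovalDemo {
    // Hypothetical helper: parses an XML string, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse XML", ex);
        }
    }

    // Attempts to remove every <item> with a forward loop and returns
    // how many <item> elements are still in the document afterwards.
    static int itemsLeftAfterForwardRemoval(String xml) {
        Document doc = parse(xml);
        NodeList nodes = doc.getElementsByTagName("item"); // live list
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            // The list shrinks here, so the next node slides into index i
            // and is skipped when i is incremented.
            e.getParentNode().removeChild(e);
        }
        return doc.getElementsByTagName("item").getLength();
    }

    public static void main(String[] args) {
        String xml = "<root><item/><item/><item/><item/></root>";
        System.out.println("items left: " + itemsLeftAfterForwardRemoval(xml));
    }
}
```

With four items, the loop removes the first and third and never looks at the second and fourth, so two items survive even though the intent was to remove all of them.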

It can be fixed by iterating backward, from the last index down to 0: removing the node at index i then only shifts nodes at higher indices, which have already been visited. This solution works only if each loop iteration removes at most one node, the current one.

NodeList nodes = ...;
// Walk from the end so a removal never shifts an index not yet visited.
for (int i = nodes.getLength() - 1; i >= 0; i--) {
  Element e = (Element)nodes.item(i);
  if (certain criteria involving Element e) {
    e.getParentNode().removeChild(e);
  }
}
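
For reference, a self-contained version of the backward loop might look like the sketch below. The class name, the parse helper, and the shouldRemove criterion (an obsolete="true" attribute) are assumptions standing in for the elided parts of the snippet above, not anything from the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BackwardRemovalDemo {
    // Hypothetical helper: parses an XML string, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse XML", ex);
        }
    }

    // Hypothetical criterion standing in for "certain criteria involving Element e".
    static boolean shouldRemove(Element e) {
        return "true".equals(e.getAttribute("obsolete"));
    }

    // Removes matching <item> elements back to front and returns
    // how many <item> elements remain in the document.
    static int itemsLeftAfterBackwardRemoval(String xml) {
        Document doc = parse(xml);
        NodeList nodes = doc.getElementsByTagName("item"); // live list
        for (int i = nodes.getLength() - 1; i >= 0; i--) {
            Element e = (Element) nodes.item(i);
            if (shouldRemove(e)) {
                // Only indices above i shift, and those were already visited.
                e.getParentNode().removeChild(e);
            }
        }
        return doc.getElementsByTagName("item").getLength();
    }

    public static void main(String[] args) {
        String xml = "<root><item obsolete='true'/><item obsolete='true'/>"
                + "<item/><item obsolete='true'/></root>";
        System.out.println("items left: " + itemsLeftAfterBackwardRemoval(xml));
    }
}
```

The sample document deliberately puts two matching elements next to each other, which is the case a forward loop gets wrong; here only the one non-matching item is left.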
Algok answered Oct 27 '22 19:10


So, given that removing nodes while traversing the NodeList will cause the NodeList to be updated to reflect the new reality, I assume that my indices will become invalid and this will not work.

So, it seems the solution is to keep track of the elements to delete during the traversal, and delete them all afterward, once the NodeList is no longer used.

NodeList nodes = ...;
Set<Element> targetElements = new HashSet<Element>();
for (int i = 0; i < nodes.getLength(); i++) {
  Element e = (Element)nodes.item(i);
  if (certain criteria involving Element e) {
    targetElements.add(e);
  }
}
for (Element e: targetElements) {
  e.getParentNode().removeChild(e);
}
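
A runnable variant of the same collect-then-remove idea is sketched below. The class name, the parse helper, and the shouldRemove criterion are hypothetical, and it swaps the HashSet for an ArrayList, which is an assumption on my part: a single pass over one NodeList cannot produce duplicates, and a list keeps document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CollectThenRemoveDemo {
    // Hypothetical helper: parses an XML string, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse XML", ex);
        }
    }

    // Hypothetical criterion standing in for "certain criteria involving Element e".
    static boolean shouldRemove(Element e) {
        return "true".equals(e.getAttribute("obsolete"));
    }

    // Collects matching <item> elements first, removes them afterwards,
    // and returns how many <item> elements remain in the document.
    static int itemsLeftAfterDeferredRemoval(String xml) {
        Document doc = parse(xml);
        NodeList nodes = doc.getElementsByTagName("item");
        List<Element> targets = new ArrayList<>();
        // Pass 1: read-only traversal, so the indices stay valid.
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            if (shouldRemove(e)) {
                targets.add(e);
            }
        }
        // Pass 2: the NodeList is no longer consulted while removing.
        for (Element e : targets) {
            e.getParentNode().removeChild(e);
        }
        return doc.getElementsByTagName("item").getLength();
    }

    public static void main(String[] args) {
        String xml = "<root><item obsolete='true'/><item obsolete='true'/>"
                + "<item/><item obsolete='true'/></root>";
        System.out.println("items left: " + itemsLeftAfterDeferredRemoval(xml));
    }
}
```

Compared with the backward loop, this costs an extra collection but does not depend on traversal direction, which can be easier to reason about when the removal criterion grows more complex.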
skiphoppy answered Oct 27 '22 18:10