removeChild(): how to remove indent too?

Let's consider the following XML document:

<items>
   <item>item1</item>
   <item>item2</item>
</items>

Now, let's remove all the items and add a new item. Code:

  //-- assume we have Element instance of <items> element: items_parent
  //   and the Document instance: doc

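  //-- remove all the items
  //   (getElementsByTagName() returns a live NodeList, so iterate backwards;
  //   a forward loop would skip elements as the list shrinks)
  NodeList items = items_parent.getElementsByTagName("item");

  for (int i = items.getLength() - 1; i >= 0; i--){
     Element curElement = (Element)items.item(i);
     items_parent.removeChild(curElement);
  }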

  //-- add a new one
  Element new_item = doc.createElement("item");
  new_item.setTextContent("item3");
  items_parent.appendChild(new_item);

New contents of the file:

<items>


   <item>item3</item>
</items>

These annoying blank lines appeared because removeChild() removes the child but leaves behind the indent and line break that preceded it. That indent and line break is treated as text content, which stays in the document.

In the related question I posted a workaround:

items_parent.setTextContent("");

It removes these blank lines. But this is a kind of hack: it removes the effect, not the cause.

So, the question is about removing the cause: how do I remove a child together with its indent and line break?

Dmitry Frank asked Jan 10 '13 09:01


2 Answers

The "indent" before the element and the "carriage return" (and following indent) after it are text nodes. If you remove an element and there's a text node before or after it, naturally those nodes are not removed.

It sounds as though you want to remove the element, and then also remove the text node in front of it (provided it consists entirely of whitespace).

E.g., perhaps along these lines (in your loop removing items):

 Element curElement = (Element)items.item(i);
 // Start new code
 Node prev = curElement.getPreviousSibling();
 if (prev != null && 
     prev.getNodeType() == Node.TEXT_NODE &&
     prev.getNodeValue().trim().length() == 0) {
     items_parent.removeChild(prev);
 }
 // End new code
 items_parent.removeChild(curElement);
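
For a version that can be run on its own, here is a minimal sketch of the same idea. It assumes the items are direct children of the parent and that the indent is always the whitespace-only text node immediately before each element. The class name and the helpers parse() and removeChildrenWithIndent() are hypothetical names chosen for this example. It walks the sibling chain instead of the NodeList, so removing nodes does not disturb the iteration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class RemoveWithIndent {

    // Hypothetical helper: parses an XML string, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Hypothetical helper: removes each direct child element named tagName,
    // plus the whitespace-only text node immediately before it (its indent).
    static void removeChildrenWithIndent(Element parent, String tagName) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before detaching child
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && tagName.equals(child.getNodeName())) {
                Node prev = child.getPreviousSibling();
                if (prev != null
                        && prev.getNodeType() == Node.TEXT_NODE
                        && prev.getNodeValue().trim().isEmpty()) {
                    parent.removeChild(prev);
                }
                parent.removeChild(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<items>\n   <item>item1</item>\n   <item>item2</item>\n</items>");
        Element itemsParent = doc.getDocumentElement();

        removeChildrenWithIndent(itemsParent, "item");

        // Print how many child nodes are left under <items>.
        System.out.println(itemsParent.getChildNodes().getLength());
    }
}
```

Note that this only cleans up after removals. An element appended afterwards will not be indented unless you also insert a whitespace text node in front of it.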

However, the real question should probably be why your XML document has extraneous whitespace text nodes in it.

T.J. Crowder answered Sep 30 '22 01:09


Actually an XML document does not have to follow any style guidelines. Therefore you cannot expect document manipulation methods to preserve a particular formatting style in your document.

What I'd recommend is to manipulate the document first without any regard to formatting (just make sure you end up with a valid XML file), and afterwards run a formatter over the whole document to get the formatting you want.
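
One possible way to do that with the standard JAXP classes is sketched below. It assumes the document is data-oriented, so whitespace-only text nodes carry no meaning and can be dropped before re-indenting. The class name and the helpers parse(), stripWhitespaceNodes() and format() are hypothetical names for this example. The indent-amount property is implementation-specific (it is honoured by Xalan-based transformers such as the one built into the JDK), and the exact output layout may vary between Java versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FormatAfterwards {

    // Hypothetical helper: parses an XML string, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Recursively drops text nodes that contain only whitespace.
    // Assumption: such nodes are formatting leftovers, not meaningful content.
    static void stripWhitespaceNodes(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before detaching child
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespaceNodes(child);
            }
            child = next;
        }
    }

    // Serializes the document with fresh indentation.
    static String format(Document doc) {
        try {
            stripWhitespaceNodes(doc.getDocumentElement());
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.INDENT, "yes");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Implementation-specific property; ignored by transformers
            // that do not recognise it.
            t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "3");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        // The document as it looked after the removals in the question.
        Document doc = parse("<items>\n\n\n   <item>item3</item>\n</items>");
        System.out.println(format(doc));
    }
}
```

Stripping the whitespace nodes first matters here: if they are left in place, the serializer keeps them and may add its own indentation on top.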

s1lence answered Sep 30 '22 00:09