Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

remove parent node without childs nodes

I have a question related to removing specific nodes from xml file.

Here is my sample of XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <nodeA attribute="1">
    <nodeB attribute="table">
      <nodeC attribute="500"></nodeC>
      <nodeC attribute="5"></nodeC>
    </nodeB>
    <nodeB attribute="3">
      <nodeC attribute="4"></nodeC>
      <nodeC attribute="5"></nodeC>
      <nodeC attribute="5"></nodeC>
    </nodeB>
    <nodeB attribute="placeHolder">
    <nodeB attribute="toRemove">
      <nodeB attribute="glass"></nodeB>
        <nodeE attribute="7"></nodeE>
      <nodeB attribute="glass"></nodeB>
      <nodeB attribute="glass"></nodeB>
    </nodeB>
    </nodeB>
    <nodeB attribute="3">
      <nodeC attribute="4"></nodeC>
      <nodeC attribute="5"></nodeC>
      <nodeC attribtue="5"></nodeC>
     </nodeB>
    <nodeB attribute="placeHolder">
    <nodeB attribute="toRemove">
      <nodeB attribute="glass"></nodeB>
        <nodeE attribute="7"></nodeE>
      <nodeB attribute="glass"></nodeB>
      <nodeB attribute="glass"></nodeB>
    </nodeB>
    </nodeB>
  </nodeA>
</root>

I would like to remove node nodeB="toRemove" without removing childrens of this node. After that I need to do same thing with nodeB attribute="placeHolder". Part of result would look like that:

     <nodeB attribute="3">
      <nodeC attribute="4"></nodeC>
      <nodeC attribute="5"></nodeC>
      <nodeC attribtue="5"></nodeC>
     </nodeB>
     <nodeB attribute="glass"></nodeB>
        <nodeE attribute="7"></nodeE>
     <nodeB attribute="glass"></nodeB>
     <nodeB attribute="glass"></nodeB>

I have been trying code like this to achive that:

        XmlNodeList nodeList = doc.SelectNodes("//nodeB[@attribute=\"toRemove\"]");

        foreach (XmlNode node in nodeList)
        {
            foreach (XmlNode child in node.ChildNodes)
            {
                node.ParentNode.AppendChild(child);
            }
            node.ParentNode.RemoveChild(node);
        }
        doc.Save(XmlFilePathSource);

I am able to locate node with desired attribute toRemove or placeHolder, however I am not able to move children of this nodes up by one level. Could you help me in this case? It can be solution with Linq, XDocument, XmlReader but I prefer working with XmlDocument. Thank you for any help you could provide me in advance.

EDIT:

In this case I have used slightly modified code(to preserve order) that Chuck Savage wrote bellow. Once to remove

  <nodeB attribute="toRemove"> </nodeB>

and then do the same with

  <nodeB attribute="placeHolder"></nodeB>

Here is slightly modified code

  XElement root = XElement.Load(XmlFilePathSource); 
  var removes = root.XPathSelectElements("//nodeB[@attribute=\"toRemove\"]");
  foreach (XElement node in removes.ToArray())
  {
    node.Parent.AddAfterSelf(node.Elements());
    node.Remove();
  }
  root.Save(XmlFilePathSource);

xslt approach provided by @MiMo is very useful as well in this case.

like image 625
wariacik Avatar asked Feb 19 '13 20:02

wariacik


1 Answers

The problem is that you cannot modify document nodes while enumerating on their children - you should create new nodes instead than trying to modify the existing ones, and that becomes a bit tricky using XmlDocument.

The easiest way to do this kind of transformation is using XSLT, i.e. applying this XSLT:

<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="nodeB[@attribute='toRemove' or @attribute='placeHolder']">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="text()">
  </xsl:template>

  <xsl:template match="@* | *">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

to the input file the output is:

<root>
  <nodeA attribute="1">
    <nodeB attribute="table">
      <nodeC attribute="500" />
      <nodeC attribute="5" />
    </nodeB>
    <nodeB attribute="3">
      <nodeC attribute="4" />
      <nodeC attribute="5" />
      <nodeC attribute="5" />
    </nodeB>
    <nodeB attribute="glass" />
    <nodeE attribute="7" />
    <nodeB attribute="glass" />
    <nodeB attribute="glass" />
    <nodeB attribute="3">
      <nodeC attribute="4" />
      <nodeC attribute="5" />
      <nodeC attribtue="5" />
    </nodeB>
    <nodeB attribute="glass" />
    <nodeE attribute="7" />
    <nodeB attribute="glass" />
    <nodeB attribute="glass" />
  </nodeA>
</root>

The code to apply the XSLT is simply:

  XslCompiledTransform transform = new XslCompiledTransform();
  transform.Load(@"c:\temp\nodes.xslt");
  transform.Transform(@"c:\temp\nodes.xml", @"c:\temp\nodes-cleaned.xml");

If it is not possible (or desirable) to use an external file for the XSLT it can be read from a string:

  string xsltString =
    @"<xsl:stylesheet 
      version='1.0' 
      xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>

      <xsl:output method=""xml"" indent=""yes""/>

      <xsl:template match=""nodeB[@attribute='toRemove' or @attribute='placeHolder']"">
        <xsl:apply-templates/>
      </xsl:template>

      <xsl:template match=""text()"">
      </xsl:template>

      <xsl:template match=""@* | *"">
        <xsl:copy>
          <xsl:apply-templates select=""@* | node()""/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>";
  XslCompiledTransform transform = new XslCompiledTransform();
  using (StringReader stringReader = new StringReader(xsltString))
  using (XmlReader reader = XmlReader.Create(stringReader)) {
    transform.Load(reader);
  }
  transform.Transform(@"c:\temp\nodes.xml", @"c:\temp\nodes-cleaned.xml");    
like image 186
MiMo Avatar answered Oct 03 '22 12:10

MiMo