
How to remove all empty XElements

This one is a little tricky. Say I have this XML document:

<Object>
    <Property1>1</Property1>
    <Property2>2</Property2>
    <SubObject>
         <DeeplyNestedObject />
    </SubObject>
</Object>

I want to get back this:

<Object>
    <Property1>1</Property1>
    <Property2>2</Property2>
</Object>

Since all of the children of <SubObject> are empty elements, I want to get rid of it too. What makes this challenging is that you can't remove nodes while you're iterating over them. Any help would be much appreciated.

UPDATE: Here's what I wound up with.

public XDocument Process()
{
    //Load my XDocument
    var xmlDoc = GetObjectXml(_source);

    //Keep track of empty elements
    var childrenToDelete = new List<XElement>();

    //Recursively iterate through each child node
    foreach (var node in xmlDoc.Root.Elements())
        Process(node, childrenToDelete);

    //Any items marked for deletion can safely be removed here
    //since we're not iterating over the source elements collection
    foreach (var deletion in childrenToDelete)
        deletion.Remove();

    return xmlDoc;
}

private void Process(XElement node, List<XElement> elementsToDelete)
{
    //Walk the child elements
    if (node.HasElements)
    {
        //This is the collection of child elements to be deleted 
        //for this particular node
        var childrenToDelete = new List<XElement>();

        //Recursively iterate each child
        foreach (var child in node.Elements())
            Process(child, childrenToDelete);

        //Delete all children that were marked as empty
        foreach (var deletion in childrenToDelete)
            deletion.Remove();

        //Since we just removed all of this node's empty children,
        //delete it if there's nothing left
        if (node.IsEmpty)
            elementsToDelete.Add(node);
    }

    //The current leaf node is empty so mark it for deletion
    else if (node.IsEmpty)
        elementsToDelete.Add(node);
}

If anyone is interested in the use case for this, it's for an ObjectFilter project I put together.

Micah asked Jun 15 '12 14:06


1 Answer

It'll be rather slow, since every pass rescans the whole tree, but you could do this:

XElement xml = xmlDoc.Root;    //xmlDoc is the XDocument from the question
while (true) {
    var empties = xml.Descendants().Where(x => x.IsEmpty && !x.HasAttributes).ToList();
    if (empties.Count == 0)
        break;

    empties.ForEach(e => e.Remove());
}
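
For anyone who wants to try the loop above, here is a minimal self-contained sketch that runs it against the sample document from the question. The class name `EmptyElementDemo` and the method name `RemoveEmpties` are made up for illustration; only the loop body comes from the snippet above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class EmptyElementDemo
{
    // Hypothetical helper: repeats the scan until no empty descendants remain.
    public static void RemoveEmpties(XElement xml)
    {
        while (true)
        {
            // ToList() snapshots the matches, so removing them
            // does not disturb the live Descendants() query.
            var empties = xml.Descendants()
                             .Where(x => x.IsEmpty && !x.HasAttributes)
                             .ToList();
            if (empties.Count == 0)
                break;

            empties.ForEach(e => e.Remove());
        }
    }

    public static void Main()
    {
        // The sample document from the question.
        var xml = XElement.Parse(
            "<Object><Property1>1</Property1><Property2>2</Property2>" +
            "<SubObject><DeeplyNestedObject /></SubObject></Object>");

        RemoveEmpties(xml);
        Console.WriteLine(xml);
    }
}
```

After the call, only Property1 and Property2 are left under Object. One thing to keep in mind: `IsEmpty` is true only for a self-closing element such as `<A />`, or for one whose last child node has just been removed. An element written as `<A></A>` is not considered empty, so it would survive this loop.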

To make it faster, scan the tree only once, then after each round of removals check only the parents of the removed nodes to see whether they have become empty.

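XElement xml = xmlDoc.Root;    //xmlDoc is the XDocument from the question
var empties = xml.Descendants().Where(x => x.IsEmpty && !x.HasAttributes).ToList();
while (empties.Count > 0) {
    //Collect the parents before removing, since Remove() clears e.Parent.
    var parents = empties.Select(e => e.Parent)
                         .Where(e => e != null && e != xml)    //Never remove the root itself
                         .Distinct()    //In case we have two empty siblings, don't try to remove the parent twice
                         .ToList();

    empties.ForEach(e => e.Remove());

    //Filter the parent nodes to the ones that just became empty.
    //RemoveAll drops the entries that match, so the test is negated.
    parents.RemoveAll(e => !e.IsEmpty || e.HasAttributes);
    empties = parents;
}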
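
A different way to get this down to a single pass, not part of the approach above, is to visit the descendants in reverse document order. Every element then comes after all of its own descendants, so a parent is only checked once its children have already been handled. This is only a sketch: the names `ReverseOrderDemo` and `RemoveEmptiesBottomUp` are hypothetical, and it uses the same "empty and no attributes" test as the snippets above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ReverseOrderDemo
{
    // Hypothetical helper: one pass over the tree, deepest and last elements first.
    public static void RemoveEmptiesBottomUp(XElement xml)
    {
        // Descendants() yields document order (parents before children);
        // reversing it yields children before parents. ToList() snapshots
        // the sequence so removals cannot affect the iteration.
        var bottomUp = xml.Descendants().Reverse().ToList();

        foreach (var e in bottomUp)
        {
            // A parent whose last child was just removed reports IsEmpty == true.
            if (e.IsEmpty && !e.HasAttributes)
                e.Remove();
        }
    }

    public static void Main()
    {
        var xml = XElement.Parse(
            "<Object><Property1>1</Property1><Property2>2</Property2>" +
            "<SubObject><DeeplyNestedObject /></SubObject></Object>");

        RemoveEmptiesBottomUp(xml);
        Console.WriteLine(xml);
    }
}
```

Because `Descendants()` does not include `xml` itself, the root is never removed, even if everything underneath it turns out to be empty.
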
SLaks answered Nov 14 '22 20:11