
Sort XML Nodes by Alpha.Numeric using C#

Say I generate an XmlDocument whose InnerXml looks like this:

<ORM_O01>
  <MSH>
    <MSH.9>
      <MSG.2>O01</MSG.2>
    </MSH.9>
    <MSH.6>
      <HD.1>13702</HD.1>
    </MSH.6>
  </MSH>
  <ORM_O01.PATIENT>
    <PID>
      <PID.18>
        <CX.1>SecondTestFin</CX.1>
      </PID.18>
      <PID.3>
        <CX.1>108</CX.1>
      </PID.3>
    </PID>
  </ORM_O01.PATIENT>
</ORM_O01>

As you can see node <PID.18> is before node <PID.3>. (<MSH.9> is also before <MSH.6>.)

Restructuring the code that generates the document would make my nice clean code very messy.

Is there a way to sort the nodes alphabetically by the part of the name before the last period, and then numerically by the part after it (when that part is a number)?

By "numeric sorting" I mean comparing the whole number rather than character by character (so PID.3 sorts before PID.18, because 3 < 18).
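
To make the difference concrete, here is a minimal sketch of the two orderings applied to a few numeric suffixes. The class and method names (SortOrderDemo, SortAsText, SortAsNumbers) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Linq;

public static class SortOrderDemo
{
    // Ordinal string comparison goes character by character, so "18" < "3".
    public static string[] SortAsText(string[] suffixes) =>
        suffixes.OrderBy(s => s, StringComparer.Ordinal).ToArray();

    // Parsing first compares the whole number, so 3 < 18.
    public static string[] SortAsNumbers(string[] suffixes) =>
        suffixes.OrderBy(s => int.Parse(s)).ToArray();

    public static void Main()
    {
        string[] suffixes = { "18", "3", "10" };
        Console.WriteLine(string.Join(", ", SortAsText(suffixes)));    // 10, 18, 3
        Console.WriteLine(string.Join(", ", SortAsNumbers(suffixes))); // 3, 10, 18
    }
}
```

The second ordering is the one I am after for the part of the node name that follows the last period.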

Vaccano asked Jan 05 '12 20:01



1 Answer

The obvious answer is yes.

If this is the result you want:

<ORM_O01>
  <MSH>
    <MSH.6>
      <HD.1>13702</HD.1>
    </MSH.6>
    <MSH.9>
      <MSG.2>O01</MSG.2>
    </MSH.9>
  </MSH>
  <ORM_O01.PATIENT>
    <PID>
      <PID.3>
        <CX.1>108</CX.1>
      </PID.3>
      <PID.18>
        <CX.1>SecondTestFin</CX.1>
      </PID.18>
    </PID>
  </ORM_O01.PATIENT>
</ORM_O01>

Then this class will do it: (I should get paid for this...)

using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

namespace Test
{
    public class SortXmlFile
    {
        XElement rootNode;

        public SortXmlFile(FileInfo file)
        {
            if (file.Exists)
                rootNode = XElement.Load(file.FullName);
            else
                throw new FileNotFoundException(file.FullName);
        }

        public XElement SortFile()
        {
            SortElements(rootNode);
            return rootNode;
        }

        public void SortElements(XElement root)
        {
            bool sortWithNumeric = false;
            XElement[] children = root.Elements().ToArray();
            foreach (XElement child in children)
            {
                string name;
                int value;
                // does any child need to be sorted by numeric?
                if (!sortWithNumeric && Sortable(child, out name, out value))
                    sortWithNumeric = true;
                child.Remove(); // we'll re-add it in the sort portion
                // sorting child's children
                SortElements(child);
            }
            // re-add children after sorting

            // sort by the name portion, which is either the full name
            // or the part that precedes the last period when a numeric value follows it.
            IOrderedEnumerable<XElement> childrenSortedByName = children
                    .OrderBy(child =>
                        {
                            string name;
                            int value;
                            Sortable(child, out name, out value);
                            return name;
                        });
            XElement[] sortedChildren;
            // if needed to sort numerically
            if (sortWithNumeric)
            {
                sortedChildren = childrenSortedByName
                    .ThenBy(child =>
                        {
                            string name;
                            int value;
                            Sortable(child, out name, out value);
                            return value;
                        })
                        .ToArray();
            }
            else
                sortedChildren = childrenSortedByName.ToArray();

            // re-add the sorted children
            foreach (XElement child in sortedChildren)
                root.Add(child);
        }

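        // Splits a name such as "PID.18" at its last period. Returns true, with
        // name = "PID" and value = 18, only when the text after that period is an integer.
        // Otherwise name is the full local name and value is -1.
        public bool Sortable(XElement node, out string name, out int value)
        {
            name = node.Name.LocalName; // local name only, so a namespace URI containing '.' cannot interfere
            int lastDot = name.LastIndexOf('.');
            if (lastDot >= 0 && Int32.TryParse(name.Substring(lastDot + 1), out value))
            {
                name = name.Substring(0, lastDot);
                return true;
            }
            value = -1;
            return false;
        }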
    }
}

Someone may be able to write this cleaner and meaner, but this should get you going.
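
Since the question starts from an in-memory XmlDocument rather than a file, here is a more compact sketch of the same idea that round-trips through XElement. It assumes C# 7 or later (tuples and inline out variables), and the names XmlNodeSorter, SortKey and Sort are made up for this example. It also uses ordinal string comparison for the name portion, which is a choice on my part and not something the question requires.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class XmlNodeSorter
{
    // Splits "PID.18" into ("PID", 18). Names without a numeric suffix keep
    // their full name and get -1.
    public static (string Prefix, int Number) SortKey(XElement element)
    {
        string name = element.Name.LocalName;
        int lastDot = name.LastIndexOf('.');
        if (lastDot >= 0 && int.TryParse(name.Substring(lastDot + 1), out int number))
            return (name.Substring(0, lastDot), number);
        return (name, -1);
    }

    // Recursively reorders the child elements of parent in place.
    public static void Sort(XElement parent)
    {
        var sorted = parent.Elements()
            .OrderBy(e => SortKey(e).Prefix, StringComparer.Ordinal)
            .ThenBy(e => SortKey(e).Number)
            .ToList();
        foreach (XElement child in sorted)
        {
            child.Remove();    // detach, then re-append in sorted order
            Sort(child);
            parent.Add(child);
        }
    }

    public static void Main()
    {
        // The sample document from the question, loaded as an XmlDocument.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<ORM_O01><MSH><MSH.9><MSG.2>O01</MSG.2></MSH.9>" +
            "<MSH.6><HD.1>13702</HD.1></MSH.6></MSH>" +
            "<ORM_O01.PATIENT><PID><PID.18><CX.1>SecondTestFin</CX.1></PID.18>" +
            "<PID.3><CX.1>108</CX.1></PID.3></PID></ORM_O01.PATIENT></ORM_O01>");

        XElement root = XElement.Parse(doc.OuterXml);
        Sort(root);

        // Load the sorted markup back into the XmlDocument.
        doc.LoadXml(root.ToString(SaveOptions.DisableFormatting));
        Console.WriteLine(doc.OuterXml);
    }
}
```

With the sample from the question, MSH.6 should end up before MSH.9 and PID.3 before PID.18. Re-parsing the markup is simple but copies the whole document, so for very large documents you may prefer to build it as an XDocument in the first place.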

Chuck Savage answered Oct 26 '22 11:10
