
Get full path to current node

Tags: c#, .net, xpath

If I have an XPathNavigator positioned on a node, how can I get an XPath expression that represents the path to that node, from the root?

For example, if the XML is:

<data>
    <class name='dogs'>
        <item name='doberman' />
        <item name='husky' />
    </class>
    <class name='cats'>
        <item name='persian' />
        <item name='tabby' />
    </class>
</data>

...then the path to the persian cat could be expressed as /data/class[2]/item[1]

I can enumerate the ancestors of the node in question with SelectAncestors() (or I could iteratively climb up the parent relationship with SelectParent()), but that doesn't get me the positional information.

Would I have to evaluate an XPath using position() for each ancestor, or is there a better way to do this?

Gary McGill asked Feb 10 '12 00:02


2 Answers

Assuming you're interested only in the xpath of xml elements, I implemented a brute force algorithm (i.e. traversing the XML structure) as extension methods on XmlElement. This is very similar to @Zenexer's answer, although I had already started on my own version when he posted his.

Also, intrigued by Alexei's tip about performance, I created a sort of test case using a somewhat complex XML file I had lying around. Then I implemented two versions of the same algorithm: one that depends on PreviousSibling, and another that iterates over the sibling nodes sequentially. A third version relied on XPath's position() function, but it didn't work as expected and was discarded.

You should measure this for yourself, but on my machine the iterative version was significantly faster: 1.7s against 21s for the siblings version.

Important: these extension methods are declared inside a static class named XmlElementExtension.

PreviousSibling version

    public static string GetXPath_UsingPreviousSiblings(this XmlElement element)
    {
        string path = "/" + element.Name;

        XmlElement parentElement = element.ParentNode as XmlElement;
        if (parentElement != null)
        {
            // Gets the position within the parent element, based on previous siblings of the same name.
            // However, this position is irrelevant if the element is unique under its parent:
            XPathNavigator navigator = parentElement.CreateNavigator();
            int count = Convert.ToInt32(navigator.Evaluate("count(" + element.Name + ")"));
            if (count > 1) // There's more than 1 element with the same name
            {
                int position = 1;
                XmlNode previousSibling = element.PreviousSibling;
                while (previousSibling != null)
                {
                    // Skip comments, text and other non-element nodes instead of stopping at them:
                    if (previousSibling.NodeType == XmlNodeType.Element && previousSibling.Name == element.Name)
                        position++;

                    previousSibling = previousSibling.PreviousSibling;
                }

                path = path + "[" + position + "]";
            }

            // Climbing up to the parent elements:
            path = parentElement.GetXPath_UsingPreviousSiblings() + path;
        }

        return path;
    }

Iterative version

    public static string GetXPath_SequentialIteration(this XmlElement element)
    {
        string path = "/" + element.Name;

        XmlElement parentElement = element.ParentNode as XmlElement;
        if (parentElement != null)
        {
            // Gets the position within the parent element.
            // However, this position is irrelevant if the element is unique under its parent:
            XmlNodeList siblings = parentElement.SelectNodes(element.Name);
            if (siblings != null && siblings.Count > 1) // There's more than 1 element with the same name
            {
                int position = 1;
                foreach (XmlElement sibling in siblings)
                {
                    if (sibling == element)
                        break;

                    position++;
                }

                path = path + "[" + position + "]";
            }

            // Climbing up to the parent elements:
            path = parentElement.GetXPath_SequentialIteration() + path;
        }

        return path;
    }
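
As a cross-check, here is a minimal sketch of a third variant that uses neither PreviousSibling nor an XPath query: it makes one forward pass over the parent's children, counting same-named elements as it goes. The class name XmlElementPathSketch and the method name GetXPathForward are made up for illustration, and I have not timed it against the two versions above. Like them, it compares element names only, so prefixed (namespaced) names would still need a namespace manager when the resulting path is evaluated.

```csharp
using System.Xml;

public static class XmlElementPathSketch
{
    // Hypothetical variant (illustrative name): one forward pass over the parent's children.
    public static string GetXPathForward(this XmlElement element)
    {
        string step = "/" + element.Name;

        XmlElement parent = element.ParentNode as XmlElement;
        if (parent == null)
            return step; // root element (its parent is the XmlDocument) or a detached element

        int position = 0; // 1-based index of 'element' among same-named siblings
        int sameName = 0; // total number of same-named element siblings
        for (XmlNode child = parent.FirstChild; child != null; child = child.NextSibling)
        {
            // Ignore text, comments and elements with a different name.
            if (child.NodeType != XmlNodeType.Element || child.Name != element.Name)
                continue;

            sameName++;
            if (ReferenceEquals(child, element))
                position = sameName;
        }

        // The index is only needed when the name is ambiguous under this parent.
        if (sameName > 1)
            step += "[" + position + "]";

        // Climb up to the parent elements.
        return parent.GetXPathForward() + step;
    }
}
```

Loading the XML from the question into an XmlDocument and calling GetXPathForward() on the persian item should give /data/class[2]/item[1], the path the question asks for.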

The test case

    private static void Measure(string functionName, int iterations, Action implementation)
    {
        Stopwatch watch = new Stopwatch();
        watch.Start();

        for (int i = 0; i < iterations; i++)
        {
            implementation();
        }

        watch.Stop();
        Console.WriteLine("{0}: {1}ms", functionName, watch.ElapsedMilliseconds);
    }

    private static void Main(string[] args)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(@"location of some large and complex XML file");

        string referenceXPath = "/vps/vendorProductSets/vendorProductSet/product[100]/prodName/locName";

        Measure("UsingPreviousSiblings", 10000,
                () =>
                {
                    XmlElement target = doc.SelectSingleNode(referenceXPath) as XmlElement;
                    // Compute the path outside Debug.Assert: assert calls are compiled out of Release builds.
                    string actual = target.GetXPath_UsingPreviousSiblings();
                    Debug.Assert(referenceXPath == actual);
                });

        Measure("SequentialIteration", 10000,
                () =>
                {
                    XmlElement target = doc.SelectSingleNode(referenceXPath) as XmlElement;
                    string actual = target.GetXPath_SequentialIteration();
                    Debug.Assert(referenceXPath == actual);
                });
    }

Humberto answered Oct 31 '22 17:10


Untested; only works with XPathNavigator objects created from XmlDocument objects:

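// Must be declared inside a static class, like any extension method.
public static string GetPath(this XPathNavigator navigator)
{
    StringBuilder path = new StringBuilder();

    // UnderlyingObject is an XmlNode only for navigators created from an XmlDocument.
    // The walk stops at the document node, so only element steps are emitted.
    for (XmlNode node = navigator.UnderlyingObject as XmlNode;
         node != null && node.NodeType == XmlNodeType.Element;
         node = node.ParentNode)
    {
        string append = "/" + node.Name;

        if (node.ParentNode != null && node.ParentNode.ChildNodes.Count > 1)
        {
            // 1-based position among preceding siblings with the same element name.
            int index = 1;
            for (XmlNode sibling = node.PreviousSibling; sibling != null; sibling = sibling.PreviousSibling)
            {
                if (sibling.NodeType == XmlNodeType.Element && sibling.Name == node.Name)
                    index++;
            }

            append += "[" + index + "]";
        }

        path.Insert(0, append);
    }

    return path.ToString();
}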

Here's how you would use it:

XPathNavigator navigator = /* ... */;
string path = navigator.GetPath();

However...

A navigator returned by XmlDocument.CreateNavigator() starts out positioned on the root node. It can be repositioned with its MoveTo... methods (for example MoveToFirstChild() or MoveToParent()), and you can also use it to select descendants. Perhaps there is a way to avoid this problem altogether? For example, if you just want the current node, you can use XPathNavigator.UnderlyingObject, as in the sample.
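
If the navigator does not come from an XmlDocument (for example, it was created from an XPathDocument), UnderlyingObject is not an XmlNode and the sample above returns an empty path. Here is a rough sketch that stays entirely within the XPathNavigator API by working on clones, so the caller's position is left alone. The names XPathNavigatorPathSketch and GetElementPath are invented for this example. It handles element nodes only and compares qualified names, so treat it as a starting point rather than a complete solution.

```csharp
using System.Xml.XPath;

public static class XPathNavigatorPathSketch
{
    // Hypothetical helper (not a framework method): positional path of the current element.
    public static string GetElementPath(this XPathNavigator navigator)
    {
        XPathNavigator current = navigator.Clone(); // never move the caller's navigator
        string path = string.Empty;

        // Stop when the walk reaches the root node (or if we did not start on an element).
        while (current.NodeType == XPathNodeType.Element)
        {
            string name = current.Name;

            // Count same-named element siblings that come before this one.
            int position = 1;
            XPathNavigator probe = current.Clone();
            while (probe.MoveToPrevious())
            {
                if (probe.NodeType == XPathNodeType.Element && probe.Name == name)
                    position++;
            }

            // The index is only needed if another same-named sibling exists, before or after.
            bool ambiguous = position > 1;
            probe = current.Clone();
            while (!ambiguous && probe.MoveToNext())
            {
                ambiguous = probe.NodeType == XPathNodeType.Element && probe.Name == name;
            }

            path = "/" + name + (ambiguous ? "[" + position + "]" : string.Empty) + path;

            if (!current.MoveToParent())
                break;
        }

        return path;
    }
}
```

With the XML from the question loaded into an XPathDocument, selecting the persian item with SelectSingleNode() and calling GetElementPath() on the resulting navigator should produce /data/class[2]/item[1].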

Zenexer answered Oct 31 '22 17:10