XPath queries in IE use zero-based indexes but the W3C spec is one-based. How should I handle the difference?

The Problem

I am converting a relatively large piece of JavaScript that currently works only in Internet Explorer so that it also works in other browsers. Since the code uses XPath extensively, we wrote a small compatibility function to make things easier:

function selectNodes(xmlDoc, xpath){
    if('selectNodes' in xmlDoc){
        //use IE logic
    }else{
        //use W3C's document.evaluate
    }
}

This is mostly working fine but we just came across the limitation that positions in IE are zero-based but in the W3C model used by the other browsers they are one-based. This means that to get the first element we need to do //books[0] in IE and //books[1] in the other browsers.

My proposed solution

The first thought was using a regex to add one to all indexes that appear in the queries if we are using the document.evaluate version:

function addOne(nStr){ return 1 + parseInt(nStr, 10); }

xpath = xpath.replace(
    /\[\s*(\d+)\s*\]/g,
    function(_, nStr){ return '[' + addOne(nStr) + ']'; }
);

My question

Is this regex based solution reasonably safe?

  • Are there any places it will convert something it should not?
  • Are there any places where it will not convert something it should?

For example, it would fail to replace the index in //books[position()=1], but since IE doesn't appear to support position() and our code doesn't use it, I think this particular case would not be a problem.
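
To make the two questions above concrete, here is a small sketch that runs the regex approach against a few inputs. The function name shiftIndexes and the sample queries are made up for illustration; only the regex itself comes from the snippet above.

```javascript
// Hypothetical helper: shift bare numeric predicates from zero-based to one-based.
function shiftIndexes(xpath) {
    return xpath.replace(/\[\s*(\d+)\s*\]/g, function (_, nStr) {
        return '[' + (parseInt(nStr, 10) + 1) + ']';
    });
}

// Plain numeric predicates are shifted, with or without surrounding whitespace.
console.log(shiftIndexes('//books[0]/title[ 2 ]'));  // //books[1]/title[3]

// Not converted: the number is part of a larger expression.
console.log(shiftIndexes('//books[position()=1]'));  // //books[position()=1]

// Converted although it should not be: the "[0]" sits inside a string literal.
console.log(shiftIndexes('//a[@label="[0]"]'));      // //a[@label="[1]"]
```

So the regex looks safe only as long as no query contains bracketed digits inside a quoted string and every index is written as a bare number.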


Considerations

  • I downloaded Sarissa to see whether it has a way to solve this, but after looking at the source code, apparently it doesn't?

  • I want to add one to the W3C version instead of subtracting one in the IE version to ease my conversion effort.


In the end

We decided to rewrite the code to use proper XPath in IE too by setting the selection language:

xmlDoc.setProperty("SelectionLanguage", "XPath");

asked Dec 22 '11 14:12 by hugomg



1 Answer

we just came across the limitation that positions in IE are zero-based but in the W3C model used by the other browsers they are one-based. This means that to get the first element we need to do //books[0] in IE and //books[1] in the other browsers.

Before doing any XPath selection, specify:

xmlDoc.setProperty("SelectionLanguage", "XPath");

MSXML3 uses a dialect of XSLT/XPath that was in use before XSLT and XPath became W3C Recommendations. The default selection language is "XSLPattern", and that is the behavior you are seeing.
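
A minimal sketch of how the compatibility function from the question might look with this setting applied, so that both branches accept the same one-based queries. The loops that copy results into an array and the choice of an ordered snapshot result are assumptions for illustration, not something taken from the original code.

```javascript
// Sketch: returns an array of nodes matching a one-based (W3C) XPath expression.
function selectNodes(xmlDoc, xpath) {
    if ('selectNodes' in xmlDoc) {
        // MSXML (IE): switch from the default XSLPattern dialect to XPath.
        xmlDoc.setProperty('SelectionLanguage', 'XPath');
        var list = xmlDoc.selectNodes(xpath);
        var nodes = [];
        for (var i = 0; i < list.length; i++) {
            nodes.push(list.item(i));
        }
        return nodes;
    }
    // W3C DOM XPath; 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
    var result = xmlDoc.evaluate(xpath, xmlDoc, null, 7, null);
    var found = [];
    for (var j = 0; j < result.snapshotLength; j++) {
        found.push(result.snapshotItem(j));
    }
    return found;
}
```

With this in place, selectNodes(xmlDoc, '//books[1]') should select the first books element in both environments. Setting the property on every call is redundant but harmless; it could equally be done once, right after the document is created.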

Read more on this topic here:

http://msdn.microsoft.com/en-us/library/windows/desktop/ms754679(v=vs.85).aspx

answered Oct 14 '22 22:10 by Dimitre Novatchev