I have some sample XML where I am querying for nodes based on a date.
Sample XML document:
<?xml version="1.0" encoding="UTF-16" standalone="yes"?>
<NewDataSet>
<Table>
<EmployeeBankGUID>dc396ebe-c8a4-4a7f-85b5-b43c1890d6bc</EmployeeBankGUID>
<ValidFromDate>2012-02-01T00:00:00-05:00</ValidFromDate>
</Table>
<Table>
<EmployeeBankGUID>2406a5aa-0246-4cd7-bba5-bb17a993042b</EmployeeBankGUID>
<ValidFromDate>2013-02-01T00:00:00-05:00</ValidFromDate>
</Table>
<Table>
<EmployeeBankGUID>2af49699-579e-4beb-9ab0-a58b4bee3158</EmployeeBankGUID>
<ValidFromDate>2014-02-01T00:00:00-05:00</ValidFromDate>
</Table>
</NewDataSet>
So there are basically three dates:
2012-02-01
2013-02-01
2014-02-01
Using MSXML I can query and filter by these dates using an XPath query:
/NewDataSet/Table[ValidFromDate>"2013-02-12"]
And this works, and returns an IXMLDOMNodeList containing one item:
<Table>
<EmployeeBankGUID>2af49699-579e-4beb-9ab0-a58b4bee3158</EmployeeBankGUID>
<ValidFromDate>2014-02-01T00:00:00-05:00</ValidFromDate>
</Table>
That XPath query uses MSXML, the XML implementation that Microsoft created in the late 1990s, before the W3C standardized on a completely different form of XPath.
DOMDocument doc = new DOMDocument();
//...load the xml...
IXMLDOMNodeList nodes = doc.selectNodes('/NewDataSet/Table[ValidFromDate>"2013-02-12"]');
But that version of MSXML is not "standards compliant" (since it was created before there were standards). Since 2005 the recommended version, the one that follows the standards and the only one that has the features I require, is MSXML 6.
It's a simple change: just instantiate a DOMDocument60 class rather than a DOMDocument class:
DOMDocument doc = new DOMDocument60();
//...load the xml...
IXMLDOMNodeList nodes = doc.selectNodes('/NewDataSet/Table[ValidFromDate>"2013-02-12"]');
Except the same XPath query returns nothing.
What is the "standards compliant" way to filter a value by date?
You might be thinking that I might be thinking that XML is treating 2013-02-01T00:00:00-05:00 as some sort of special date, when in reality it's a string. So maybe I should just think of it as a string comparison.
Which would work, except that it doesn't work. No string comparison works:
/NewDataSet/Table[ValidFromDate<"a"] returns no nodes
/NewDataSet/Table[ValidFromDate>"a"] returns no nodes
/NewDataSet/Table[ValidFromDate!="a"] returns all nodes
/NewDataSet/Table[ValidFromDate>"2014-02-12T00:00:00-05:00"] returns no nodes
/NewDataSet/Table[ValidFromDate<"2014-02-12T00:00:00-05:00"] returns no nodes
/NewDataSet/Table[ValidFromDate!="2014-02-12T00:00:00-05:00"] returns no nodes
What is the "standards compliant" way to achieve what used to work?
What is the "correct" way to XPath query for date strings?
Or, better yet, why are my XPath queries not working?
Or, better better yet, why does the query that used to work no longer work? What decision made the old syntax invalid? What edge cases were they solving by "breaking" the query syntax?
Here's the final functional code, nearly in the language I use:
DOMDocument60 GetXml(String url)
{
    XmlHttpRequest xml = CoServerXMLHTTP60.Create();
    xml.Open('GET', url, False, '', '');
    xml.Send(EmptyParam);
    DOMDocument60 doc = xml.responseXML AS DOMDocument60;
    //MSXML6 removed all kinds of features originally present (thanks W3C)
    //Need to use Microsoft's proprietary extensions to get some of it back (thanks W3C)
    doc.setProperty('SelectionNamespaces', 'xmlns:ms="urn:schemas-microsoft-com:xslt"');
    return doc;
}
DOMDocument doc = GetXml('http://example.com/GetBanks.ashx?employeeID=12345');
//Finds future banks.
//Only works in MSXML3; intentionally broken in MSXML6 (thanks W3C):
//String qry = '/NewDataSet/Table[ValidFromDate > "2014-02-12"]';
//MSXML6 compatible version of doing the above (send complaints to W3C):
String qry = '/NewDataSet/Table[ms:string-compare(ValidFromDate, "2014-02-12") >= 0]';
IXMLDOMNodeList nodes = doc.selectNodes(qry);
What is the "correct" way to XPath query for date strings?
In XPath 1.0 there is no way to handle date strings, or at least no correct way: just think of time zone support. Comparing strings will fail if the time zones are different.
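To make the time zone problem concrete, here is a minimal Python sketch (not XPath). The two timestamps are hypothetical values chosen for illustration and are not part of the sample data; the only point is that character-by-character ordering and actual chronological ordering can disagree once the offsets differ.

```python
from datetime import datetime

# Hypothetical timestamps with different UTC offsets (not from the sample XML).
a = "2014-02-01T00:00:00-05:00"  # 05:00 UTC
b = "2014-02-01T03:00:00+01:00"  # 02:00 UTC, three hours before a

# Plain string comparison only looks at the characters, so b sorts after a.
string_says_b_is_later = b > a

# Parsing the offsets compares the real instants, so b is actually earlier.
instant_says_b_is_later = datetime.fromisoformat(b) > datetime.fromisoformat(a)

print(string_says_b_is_later)   # True
print(instant_says_b_is_later)  # False
```

If every value in the document carries the same offset, as in the sample XML, the two orderings agree and a string comparison is good enough.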
Or, better yet, why are my XPath queries not working?
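XPath 1.0 only defines equality operators on strings; for greater-than and less-than comparisons, both values are converted to numbers. A date string such as 2014-02-01T00:00:00-05:00 is not a number, so it converts to NaN and every such comparison is false.
Use ms:string-compare, which was introduced in MSXML 4.0: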
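The conversion rule can be illustrated with a rough Python emulation. This is only a sketch of the XPath 1.0 behaviour, not MSXML's code: the helper names xpath_number, xpath_greater_than, xpath_less_than and xpath_not_equal are made up for this example.

```python
import math
import re

# XPath 1.0 number(): optional whitespace, optional minus sign, digits with an
# optional decimal point. Anything else becomes NaN.
_XPATH_NUMBER = re.compile(r"\s*-?(\d+(\.\d*)?|\.\d+)\s*")

def xpath_number(text):
    """Hypothetical helper approximating XPath 1.0 string-to-number conversion."""
    return float(text) if _XPATH_NUMBER.fullmatch(text) else math.nan

def xpath_greater_than(left, right):
    # Relational operators convert both operands to numbers first.
    return xpath_number(left) > xpath_number(right)

def xpath_less_than(left, right):
    return xpath_number(left) < xpath_number(right)

def xpath_not_equal(left, right):
    # Equality operators on two strings compare them as strings.
    return left != right

value = "2014-02-01T00:00:00-05:00"
print(xpath_greater_than(value, "2013-02-12"))  # False: NaN > NaN
print(xpath_greater_than(value, "a"))           # False
print(xpath_less_than(value, "a"))              # False
print(xpath_not_equal(value, "a"))              # True
```

Any comparison involving NaN is false, which matches the "a" results in the question: < and > select nothing, while != is evaluated as a string comparison and selects every node.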
/NewDataSet/Table[
ms:string-compare(ValidFromDate, "2014-02-12T00:00:00-05:00") > 0
]
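For readers without MSXML at hand, the effect of a three-way string comparison on the sample dates can be sketched in Python. This assumes a plain ordinal comparison that returns -1, 0 or 1; ms:string-compare also accepts optional language and options arguments that this sketch ignores, so treat string_compare here as a hypothetical stand-in rather than the real function.

```python
def string_compare(x, y):
    """Three-way ordinal comparison: -1 if x < y, 0 if equal, 1 if x > y."""
    return (x > y) - (x < y)

# ValidFromDate values from the sample document.
valid_from_dates = [
    "2012-02-01T00:00:00-05:00",
    "2013-02-01T00:00:00-05:00",
    "2014-02-01T00:00:00-05:00",
]

cutoff = "2013-02-12"
later = [d for d in valid_from_dates if string_compare(d, cutoff) > 0]
print(later)  # ['2014-02-01T00:00:00-05:00']
```

This works because ISO 8601 timestamps put the most significant fields first, so character order matches chronological order as long as all values use the same offset.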
What is the "standards compliant" way to achieve what used to work?
An alternative that also works in other XPath implementations (I tested it using xmllint, which uses libxml) might be to translate away all non-numeric characters, so the string will be parseable as a number:
/NewDataSet/Table[
translate(ValidFromDate, "-:T", "") < translate("2014-02-12T00:00:00-05:00", "-:T", "")
]
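The same idea can be tried outside XPath. The Python sketch below mirrors the translate() call with str.translate and filters the sample document with the standard library's ElementTree; the helper names as_number and tables_before are invented for this example, and the XML declaration is left out of the embedded sample only to keep the string easy to parse.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<NewDataSet>
<Table>
<EmployeeBankGUID>dc396ebe-c8a4-4a7f-85b5-b43c1890d6bc</EmployeeBankGUID>
<ValidFromDate>2012-02-01T00:00:00-05:00</ValidFromDate>
</Table>
<Table>
<EmployeeBankGUID>2406a5aa-0246-4cd7-bba5-bb17a993042b</EmployeeBankGUID>
<ValidFromDate>2013-02-01T00:00:00-05:00</ValidFromDate>
</Table>
<Table>
<EmployeeBankGUID>2af49699-579e-4beb-9ab0-a58b4bee3158</EmployeeBankGUID>
<ValidFromDate>2014-02-01T00:00:00-05:00</ValidFromDate>
</Table>
</NewDataSet>"""

# Equivalent of translate(value, "-:T", ""): delete those three characters.
STRIP = str.maketrans("", "", "-:T")

def as_number(timestamp):
    """Hypothetical helper: strip separators and read the digits as a double."""
    return float(timestamp.translate(STRIP))

def tables_before(root, cutoff):
    """Return Table elements whose ValidFromDate is numerically below cutoff."""
    limit = as_number(cutoff)
    return [t for t in root.findall("Table")
            if as_number(t.findtext("ValidFromDate")) < limit]

root = ET.fromstring(SAMPLE)
matches = tables_before(root, "2013-02-12T00:00:00-05:00")
guids = [t.findtext("EmployeeBankGUID") for t in matches]
print(guids)
```

Two caveats apply to the XPath version as well. The literal needs the same shape as the stored values: a date-only literal such as "2013-02-12" yields an 8-digit number that cannot be compared meaningfully with the 19-digit ones. And because XPath 1.0 numbers are doubles, 19 digits exceed what can be represented exactly, so differences confined to the last few digits may be lost.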