
Why did the definition of dot (.) change between XPath 1.0 and 2.0?

While researching details for an answer to an XPath question here on Stack Overflow, I ran into a difference between XPath 1.0 and 2.0 for which I can find no rationale.

I tried to understand what . really means.

  • In XPath 1.0, . is an abbreviation for self::node(). Both self and node are crystal-clear to me.
  • In XPath 2.0, . is a primary expression called the "context item expression". The Abbreviated Syntax section explicitly states that in a note.

What was the rationale for the change? Is there a difference between . and self::node() in XPath 2.0?

From the spec itself, the intent of the change is not clear to me. I tried googling keywords like dot or period, primary expression, and rationale.

asked Sep 02 '16 19:09 by Palec




2 Answers

XPath 1.0 had four data types: string, number, boolean, and node-set. There was no way of handling collections of values other than nodes. This meant, for example, that there was no way of summing over derived values: if elements had attributes of the form price='$23.95', you could not sum the numbers obtained by stripping off the $ sign, because the result of such stripping would be a set of numbers, and there was no such data type.
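To make the price example concrete, here is a rough Python sketch of the two-step computation that XPath 1.0 could not express: derive a collection of numbers from attribute nodes, then sum it. The sample document and its element names are assumptions made up for illustration, and Decimal stands in for XPath's numeric types only to keep the arithmetic exact.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical sample document; element and attribute names are assumptions.
XML = """<order>
  <item price="$23.95"/>
  <item price="$5.05"/>
</order>"""

root = ET.fromstring(XML)

# Step 1: derive a sequence of atomic values (numbers) from the attribute nodes.
# Rough XPath 2.0 counterpart:
#   for $p in //item/@price return number(substring-after($p, '$'))
prices = [Decimal(item.get("price").lstrip("$")) for item in root.iter("item")]

# Step 2: aggregate over the derived numbers. XPath 1.0's sum() accepts only
# a node-set, so this intermediate list had no equivalent there.
total = sum(prices, Decimal("0"))
print(total)
```

The intermediate list `prices` is the part that had no XPath 1.0 data type: it is a collection of numbers, not of nodes.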

So XPath 2.0 introduced more general sequences, and that meant that the facilities for manipulating sequences had to be generalised; for example if $X is a sequence of numbers, then $X[. > 0] filters the sequence to include only the positive numbers. But that only works if "." can refer to a number as well as to a node.
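The filtering behaviour can be modelled in a few lines of Python. This is only an analogy, not how any XPath processor is implemented: the predicate is evaluated once per item, with that item playing the role of "." regardless of whether it is a number or a node. The helper name `filter_items` and the sample data are invented for the sketch.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, Union

# A sequence item is either an atomic value or a node.
Item = Union[int, float, str, ET.Element]


def filter_items(seq: Iterable[Item], predicate: Callable[[Item], bool]) -> List[Item]:
    """Model of E[P]: evaluate P once per item, with that item as the context item."""
    return [item for item in seq if predicate(item)]


# Roughly $X[. > 0] over a sequence of numbers: "." is bound to each number.
numbers = [3, -1, 0, 7.5, -2]
positives = filter_items(numbers, lambda dot: dot > 0)

# The same mechanism over nodes, roughly item[. = 'b']: "." is bound to each node.
nodes = [ET.fromstring(f"<item>{text}</item>") for text in ("a", "b")]
matching = filter_items(nodes, lambda dot: dot.text == "b")

print(positives)
print([node.text for node in matching])
```

One filtering mechanism serves both cases only because the context item is not restricted to nodes.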

answered Oct 17 '22 04:10 by Michael Kay


In short: self::node() filters out atomic items, while . does not. Atomic items (numbers, strings, and many other XML Schema types) are not nodes (unlike elements, attributes, comments, etc.).

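Consider the example from the spec: (1 to 100)[. mod 5 eq 0]. If the . is replaced by self::node(), the expression no longer works: self::node() is an axis step, and axis steps operate only on nodes, so it cannot supply the integer that mod needs as its operand. In XPath 2.0, applying an axis step to a context item that is not a node is a type error, and atomization does not help in this case.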
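The contrast can be sketched in Python as follows. This is an illustrative model, assuming the XPath 2.0 rule that an axis step requires the context item to be a node; the function names `dot` and `self_node` are invented for the sketch, and Python's TypeError merely stands in for the XPath type error.

```python
import xml.etree.ElementTree as ET


def dot(context_item):
    """Model of '.' in XPath 2.0: return the context item, whatever it is."""
    return context_item


def self_node(context_item):
    """Model of self::node(): an axis step, so the context item must be a node."""
    if not isinstance(context_item, ET.Element):
        raise TypeError("axis step applied to a context item that is not a node")
    return context_item


# Rough counterpart of (1 to 100)[. mod 5 eq 0]
multiples = [i for i in range(1, 101) if dot(i) % 5 == 0]

# Rough counterpart of (1 to 100)[self::node() mod 5 eq 0]
try:
    [i for i in range(1, 101) if self_node(i) % 5 == 0]
    axis_step_failed = False
except TypeError:
    axis_step_failed = True

print(multiples[:3], axis_step_failed)
```

When the context item is a node, both functions return the same thing, which matches the XPath 1.0 reading of "." as self::node().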

For those scanning the spec: XPath 2.0 defines the item() type-matching construct, but it has nothing to do with node tests, because atomic values are not nodes and axis steps always return only nodes. Therefore, dot cannot be defined as self::item(). It really needs to be a special language construct.

answered Oct 17 '22 06:10 by Palec