I want to extract information from a web page.
The page has m nodes, which can be found by .evaluate("//div[@class='news']", document, ....).
Each of these nodes contains 3 nodes, each with a different @class value, and I want to extract these m 3-tuple records.
I tried to use the .evaluate() function as described in
https://developer.mozilla.org/en/Introduction_to_using_XPath_in_JavaScript
with this code:
parentNodes = document.evaluate("//div[@class='news']", document, ....);
while (true) {
    var node = parentNodes.iterateNext();
    var child = document.evaluate("//div[@class='title']", node, ....);
    ...
}
However, "child" is always assigned the first matching node in the document, instead of the first matching node within "node".
I ran this in the Firebug console.
Does anyone know what's wrong?
Your expression starts with "//", which is an absolute path. Hence, the XPath expression is evaluated from the root of the document tree, regardless of the context node you pass to evaluate. If you want XPath to select a node from within the current context, e.g. among the descendants of the current node, start the expression with the ".//" context selector.
If you start an XPath expression with "/", you are starting down from the root node (the document node) of the tree that contains the context node. So instead of "//div[@class = 'title']" use "descendant::div[@class = 'title']"; that way you select the descendant div elements of the context node.
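Putting the advice together, here is a minimal sketch of the whole extraction with a context-relative expression. The function name extractRecords and the fieldClasses parameter are made up for illustration, and the sketch assumes the three inner nodes are div elements, as "title" is in the question. The two numeric constants mirror XPathResult.ORDERED_NODE_SNAPSHOT_TYPE and XPathResult.FIRST_ORDERED_NODE_TYPE, so the function does not depend on a global XPathResult.

```javascript
// Numeric values of the standard XPathResult result-type constants.
var ORDERED_NODE_SNAPSHOT_TYPE = 7; // XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
var FIRST_ORDERED_NODE_TYPE = 9;    // XPathResult.FIRST_ORDERED_NODE_TYPE

// Hypothetical helper: returns one record per div.news, with one
// property per class name listed in fieldClasses.
function extractRecords(doc, fieldClasses) {
  var records = [];

  // An absolute path is fine here because the context node is the document.
  var parents = doc.evaluate("//div[@class='news']", doc, null,
      ORDERED_NODE_SNAPSHOT_TYPE, null);

  for (var i = 0; i < parents.snapshotLength; i++) {
    var parent = parents.snapshotItem(i);
    var record = {};

    fieldClasses.forEach(function (cls) {
      // The leading "." anchors the search at `parent`, not at the root.
      // "descendant::div[@class='...']" would work equally well.
      var hit = doc.evaluate(".//div[@class='" + cls + "']", parent, null,
          FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
      record[cls] = hit ? hit.textContent : null;
    });

    records.push(record);
  }
  return records;
}
```

In the Firebug console this could be called as extractRecords(document, ['title', 'date', 'summary']), where 'date' and 'summary' are placeholders for whatever the other two class names are on your page. A snapshot result is used for the outer query instead of iterateNext() so that the loop has a fixed length and ends without a while (true).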