
Using XPath to find a node under a context node does not work (Firefox/Firebug/JavaScript)


I want to extract information from a web page.

The page has m nodes, which can be found with .evaluate("//div[@class='news']", document, ....).

Each of these nodes contains 3 nodes, each with a different @class value. I want to extract these m 3-tuple records.

I tried to use the .evaluate() function as described in

https://developer.mozilla.org/en/Introduction_to_using_XPath_in_JavaScript

by using this code

var parentNodes = document.evaluate("//div[@class='news']", document, ....);
while (true){
   var node = parentNodes.iterateNext();
   var child = document.evaluate("//div[@class='title']", node, ....);
   ...
}

However, "child" is always assigned the first matching node in the whole document, instead of the first matching node within "node".

I ran this in the Firebug console.

Does anyone know what's wrong?

manova asked Mar 07 '10 23:03


2 Answers

You are calling evaluate with an expression that starts with //, which is an absolute path. Hence, the XPath expression is evaluated from the root of the document tree, regardless of the context node you pass in. If you want XPath to select nodes from within the current context, e.g. among the descendants of the current node, start the expression with the .// context selector, as in ".//div[@class='title']".
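As a rough sketch of the .// approach applied to the loop in the question: the function name extractNewsRecords and the classNames parameter are my own inventions for illustration, and I am assuming the inner nodes are div elements like the "title" one and that the class names contain no quote characters. Only the "news" and "title" classes come from the question.

```javascript
// Sketch: collect one record per news node, querying relative to each node.
// `doc` is a DOM Document; `classNames` is a hypothetical list of the inner
// class names to extract (only "title" appears in the question).
function extractNewsRecords(doc, classNames) {
  // A snapshot result is a static list, so it can be indexed freely
  // while other queries run.
  const parents = doc.evaluate(
    "//div[@class='news']", doc, null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);

  const records = [];
  for (let i = 0; i < parents.snapshotLength; i++) {
    const node = parents.snapshotItem(i);
    const record = {};
    for (const cls of classNames) {
      // The leading "." anchors the search at `node` instead of the document root.
      const hit = doc.evaluate(
        ".//div[@class='" + cls + "']", node, null,
        XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
      record[cls] = hit ? hit.textContent : null;
    }
    records.push(record);
  }
  return records;
}
```

In a browser console this could be called as extractNewsRecords(document, ["title"]), adding whatever the other two class names are on the real page. A class that is missing inside a given news node shows up as null in that record.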

Anatoly Fayngelerin answered Jan 14 '23 17:01


If you start an XPath expression with "/", you start from the root (document) node of the document that contains the context node, not from the context node itself. So instead of "//div[@class = 'title']" use "descendant::div[@class = 'title']"; that way you select the descendant div elements of the context node.
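A minimal sketch of the descendant:: form, assuming a browser DOM where XPathResult is available. The helper name firstTitleIn is hypothetical and not part of any API.

```javascript
// Sketch: a relative path using the descendant axis, evaluated against
// `contextNode`. `firstTitleIn` is a hypothetical helper name.
function firstTitleIn(doc, contextNode) {
  // No leading "/", so the path starts at the context node, not the root.
  const expr = "descendant::div[@class = 'title']";
  return doc.evaluate(
    expr, contextNode, null,
    XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
}
```

Inside the question's loop this would be called as firstTitleIn(document, node). It returns null when the context node has no matching descendant.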

Martin Honnen answered Jan 14 '23 16:01