I am trying to parse XML files with Nokogiri and XPath in Ruby. I don't usually run into problems, but with the following file none of my XPath queries return anything:
doc = Nokogiri::HTML(open("myfile.xml"))
doc.xpath("//Meta").count
# result ==> 0
doc.xpath("//Meta")
# result ==> []
doc.xpath(".").count
# result => 1
Here is a simplified version of my XML file:
<Answer xmlns="test:com.test.search" context="hf%3D10%26target%3Dst0" last="0" estimated="false" nmatches="1" nslices="0" nhits="1" start="0">
<time>
...
</time>
<promoted>
...
</promoted>
<hits>
<Hit url="http://www.test.com/" source="test" collapsed="false" preferred="false" score="1254772" sort="0" mask="272" contentFp="4294967295" did="1287" slice="1">
<groups>
...
</groups>
<metas>
<Meta name="enligne">
<MetaString name="value">
</MetaString>
</Meta>
<Meta name="language">
<MetaString name="value">
fr
</MetaString>
</Meta>
<Meta name="text">
<MetaText name="value">
<TextSeg highlighted="false" highlightClass="0">
La
</TextSeg>
</MetaText>
</Meta>
</metas>
</Hit>
</hits>
<keywords>
...
</keywords>
<groups>
...
</groups>
</Answer>
How can I get all children of <Hit> from this XML?
Include the namespace information when calling xpath:
doc.xpath("//x:Meta", "x" => "test:com.test.search")
You can use the remove_namespaces! method on the parsed document. It strips every namespace, so a plain //Meta query then matches.
This is one of the most frequently asked XPath questions; search for "XPath default namespace".
If there is no way to register a namespace for the default namespace and use the registered prefix (say "x" in //x:Meta), then use:
//*[name() = 'Meta' and namespace-uri() = 'test:com.test.search']
If it is known that Meta can only belong to the default namespace, then the above can be shortened to:
//*[name() = 'Meta']