
How to use xmlns declarations with XPath in Nokogiri

I'm using Nokogiri::XML to parse responses from Amazon SimpleDB. The response is something like:

<SelectResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
  <SelectResult>
    <Item>
      <Attribute><Name>Foo</Name><Value>42</Value></Attribute>
      <Attribute><Name>Bar</Name><Value>XYZ</Value></Attribute>
    </Item>
  </SelectResult>
</SelectResponse>

If I just hand the response straight over to Nokogiri, all XPath queries (e.g. doc/"//Item/Attribute[Name='Foo']/Value") return an empty array. But if I remove the xmlns attribute from the SelectResponse tag, it works perfectly.

Is there some extra thing I need to do to account for the namespace declaration? This workaround feels horribly like a hack.

Mark Rendle asked Nov 15 '09 13:11


2 Answers

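That XPath query looks for elements that are not in any namespace, but the xmlns declaration on the root puts SelectResponse and all of its unprefixed descendants in the http://sdb.amazonaws.com/doc/2007-11-07/ namespace. You need to tell your XPath processor that you are looking for elements in that namespace.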

One way to do that with Nokogiri is:

doc = Nokogiri::XML.parse(...)
doc.xpath("//aws:Item/aws:Attribute[aws:Name='Foo']/aws:Value", {"aws" => "http://sdb.amazonaws.com/doc/2007-11-07/"})
hrnt answered Oct 16 '22 08:10


I found "Namespaces in XML" really helpful in understanding what's going on.

Basically, if the document declares a default namespace via xmlns=, its elements are in that namespace, and your XPath searches must use a namespace prefix to match them.

So in your case, you could do one of three things:

  • Remove the xmlns attribute from the root SelectResponse. In that case your original, namespace-less XPath query will work.

  • Use the default namespace in your XPath query, via the xmlns prefix that Nokogiri binds to the root element's default namespace:

    doc/"//xmlns:Item/xmlns:Attribute[xmlns:Name='Foo']/xmlns:Value"
    
  • Define a custom namespace in the second argument of the xpath method and use that in your query, as shown in hrnt's solution above.

Matt Zukowski answered Oct 16 '22 08:10