I'm using Nokogiri::XML to parse responses from Amazon SimpleDB. The response is something like:
<SelectResponse xmlns="http://sdb.amazonaws.com/doc/2007-11-07/">
<SelectResult>
<Item>
<Attribute><Name>Foo</Name><Value>42</Value></Attribute>
<Attribute><Name>Bar</Name><Value>XYZ</Value></Attribute>
</Item>
</SelectResult>
</SelectResponse>
If I hand the response straight to Nokogiri, every XPath query (e.g. doc/"//Item/Attribute[Name='Foo']/Value") returns an empty array. But if I remove the xmlns attribute from the SelectResponse tag, it works perfectly.
Is there some extra thing I need to do to account for the namespace declaration? This workaround feels horribly like a hack.
That XPath query looks for elements that are not in any namespace. You need to tell your XPath processor that you are looking for elements in the http://sdb.amazonaws.com/doc/2007-11-07/ namespace.
One way to do that with Nokogiri is:
doc = Nokogiri::XML.parse(...)
doc.xpath("//aws:Item/aws:Attribute[aws:Name='Foo']/aws:Value", {"aws" => "http://sdb.amazonaws.com/doc/2007-11-07/"})
I found "Namespaces in XML" really helpful in understanding what's going on.
Basically, if the document declares a default namespace via xmlns=, every element without a prefix belongs to that namespace, so you must qualify element names with a namespace in your XPath searches.
So in your case, you could do one of three things:
1. Remove the xmlns attribute from the root SelectResponse element. In that case your original, namespace-less XPath query will work.
2. Use the default namespace in your XPath query: doc/"//xmlns:Item/xmlns:Attribute[xmlns:Name='Foo']/xmlns:Value"
3. Define a custom namespace in the second argument of the xpath method and use that in your query, as shown in hrnt's solution above.