XPath with lxml failing

Tags: python, xpath, lxml

I am trying to use XPath to query an HTML document parsed with lxml. The document is a plain HTML-only download of the Wikipedia page about Plastic. I parse it with lxml, disabling entity substitution to avoid an error with '&reg':

from lxml import etree
root = etree.parse("plastic.html",etree.XMLParser(resolve_entities=False))

Then I retrieve the namespace URL:

htmltag = root.iter().next()
nsurl = htmltag.nsmap.values()[0]

Now I would like to run XPath queries on either 'root' or 'htmltag', but I am unable to do so. I have tried several variants; the following seems to me the most correct form, yet it still raises an error:

root.xpath('//ns:body',namespace={'ns',nsurl})

And this is what I get:

XPathResultError: Unknown return type: dict

I am running the commands in an IPython console, but I don't think that is the problem. What am I doing wrong?

541

asked Feb 28 '12 00:02

el_technic0

1 Answer

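This is a simple misspelling: the keyword argument is namespaces, not namespace. lxml treats any keyword argument it does not recognize as an XPath variable, so namespace=... was passed as a variable whose dict value it could not convert, hence the XPathResultError. The value must also be a dict mapping prefix to URI, i.e. {'ns': nsurl} with a colon; {'ns',nsurl} with a comma is a set.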
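A minimal sketch of the corrected call, in case it helps. It assumes Python 3 and substitutes a tiny inline XHTML string for plastic.html (the XHTML constant and its contents are made up for illustration). It also assumes the document declares a default namespace, which lxml stores in nsmap under the key None.

```python
import io

from lxml import etree

# Hypothetical stand-in for plastic.html: a small XHTML document
# that declares a default namespace on the <html> element.
XHTML = b"""<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Plastic</title></head>
  <body><p>Plastic is a material.</p></body>
</html>"""

# Same parser settings as in the question.
parser = etree.XMLParser(resolve_entities=False)
root = etree.parse(io.BytesIO(XHTML), parser)

# The root element; its default namespace is stored under the key None.
htmltag = root.getroot()
nsurl = htmltag.nsmap[None]

# Note the plural keyword `namespaces` and the dict (prefix -> URI).
bodies = root.xpath('//ns:body', namespaces={'ns': nsurl})
print([el.tag for el in bodies])
```

Looking the namespace up with nsmap[None] avoids relying on the order of nsmap.values(), which matters if the document declares more than one namespace.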

165

answered Oct 01 '22 18:10

Takaki Taniguchi
