Python: adding namespaces in lxml


I'm trying to specify a namespace using lxml similar to this example (taken from here):

<TreeInventory xsi:noNamespaceSchemaLocation="Trees.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
</TreeInventory>

I'm not sure how to declare the XMLSchema-instance namespace (the xmlns:xsi part) or how to add the schema location attribute. The documentation got me started with something like:

>>> from lxml import etree
>>> NS = 'http://www.w3.org/2001/XMLSchema-instance'
>>> TREE = '{%s}' % NS
>>> NSMAP = {None: NS}  # a None key makes NS the default namespace
>>> tree = etree.Element(TREE + 'TreeInventory', nsmap=NSMAP)
>>> etree.tostring(tree, pretty_print=True)
b'<TreeInventory xmlns="http://www.w3.org/2001/XMLSchema-instance"/>\n'

I'm not sure how to bind the namespace to the xsi prefix instead of making it the default namespace, though, or how to then add the schema location attribute. It seems like this can be done with the nsmap keyword argument of etree.Element, but I don't see how.
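
For reference, here is a minimal sketch of one approach that appears to produce the target markup. It assumes that mapping the prefix 'xsi' (rather than None) in nsmap declares xmlns:xsi, and that the attribute can be set under its namespace-qualified name in lxml's '{uri}name' notation. The variable names XSI_NS, LOCATION_ATTR, root and xml are illustrative and not taken from the documentation.

```python
from lxml import etree

# Namespace URI for XML Schema instance attributes.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Namespace-qualified attribute name in '{uri}localname' (Clark) notation.
LOCATION_ATTR = "{%s}noNamespaceSchemaLocation" % XSI_NS

# Map the 'xsi' prefix to the namespace. A None key would instead
# declare a default namespace (xmlns="..."), as in the attempt above.
root = etree.Element("TreeInventory", nsmap={"xsi": XSI_NS})

# The root element itself stays un-namespaced; only the attribute is qualified.
root.set(LOCATION_ATTR, "Trees.xsd")

xml = etree.tostring(root, pretty_print=True).decode()
print(xml)
```

The element tag is left unqualified here because the target document has TreeInventory in no namespace; only the noNamespaceSchemaLocation attribute belongs to the xsi namespace.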