Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python etree control empty tag format

Tags:

python

xml

When creating an XML file with Python's etree, if we write to the file an empty tag using SubElement, I get:

<MyTag />

Unfortunately, our XML parser library used in Fortran doesn't handle this even though it's a correct tag. It needs to see:

<MyTag></MyTag>

Is there a way to change the formatting rules or something in etree to make this work?

like image 567
tpg2114 Avatar asked Sep 17 '12 13:09

tpg2114


People also ask

How XML is used in Python?

XML stands for eXtensible Markup Language. It was designed to store and transport small to medium amounts of data and is widely used for sharing structured information. Python enables you to parse and modify XML documents. In order to parse XML document, you need to have the entire XML document in memory.


2 Answers

As of Python 3.4, you can use the short_empty_elements argument for both the tostring() function and the ElementTRee.write() method:

>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.fromstring('<mytag/>'), short_empty_elements=False)
b'<mytag></mytag>'

In older Python versions, (2.7 through to 3.3), as a work-around you can use the html method to write out the document:

>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.fromstring('<mytag/>'), method='html')
'<mytag></mytag>'

Both the ElementTree.write() method and the tostring() function support the method keyword argument.

On even earlier versions of Python (2.6 and before) you can install the external ElementTree library; version 1.3 supports that keyword.

Yes, it sounds a little weird, but the html output mostly outputs empty elements as a start and end tag. Some elements still end up as empty tag elements; specifically <link/>, <input/>, <br/> and such. Still, it's that or upgrade your Fortran XML parser to actually parse standards-compliant XML!

like image 175
Martijn Pieters Avatar answered Sep 23 '22 05:09

Martijn Pieters


This was directly solved in Python 3.4. From then, the write method of xml.etree.ElementTree.ElementTree has the short_empty_elements parameter which:

controls the formatting of elements that contain no content. If True (the default), they are emitted as a single self-closed tag, otherwise they are emitted as a pair of start/end tags.

More details in the xml.etree documentation.

like image 28
khyox Avatar answered Sep 24 '22 05:09

khyox