I have a (old) tool which does not understand self-closing tags like <STATUS/>
. So, we need to serialize our XML files with opened/closed tags like this: <STATUS></STATUS>
.
Currently I have:
>>> from lxml import etree
>>> para = """<ERROR>The status is <STATUS></STATUS>.</ERROR>"""
>>> tree = etree.XML(para)
>>> etree.tostring(tree)
'<ERROR>The status is <STATUS/>.</ERROR>'
How can I serialize with opened/closed tags?
<ERROR>The status is <STATUS></STATUS>.</ERROR>
Solution
Given by wildwilhelm, below:
>>> from lxml import etree
>>> para = """<ERROR>The status is <STATUS></STATUS>.</ERROR>"""
>>> tree = etree.XML(para)
>>> for status_elem in tree.xpath("//STATUS[string() = '']"):
... status_elem.text = ""
>>> etree.tostring(tree)
'<ERROR>The status is <STATUS></STATUS>.</ERROR>'
It seems like the <STATUS>
tag gets assigned a text
attribute of None
:
>>> tree[0]
<Element STATUS at 0x11708d4d0>
>>> tree[0].text
>>> tree[0].text is None
True
If you set the text
attribute of the <STATUS>
tag to an empty string, you should get what you're looking for:
>>> tree[0].text = ''
>>> etree.tostring(tree)
'<ERROR>The status is <STATUS></STATUS>.</ERROR>'
With this is mind, you can probably walk a DOM tree and fix up text
attributes before writing out your XML. Something like this:
# prevent creation of self-closing tags
for node in tree.iter():
if node.text is None:
node.text = ''
If you tostring lxml dom is HTML
, you can use
etree.tostring(html_dom, method='html')
to prevent self-closing tag like <a />
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With