I have the following code fragment:
from xml.etree.ElementTree import fromstring, tostring

mathml = fromstring(input)
# iter() walks the whole tree; getiterator() is deprecated and was removed in Python 3.9
for elem in mathml.iter():
    elem.tag = 'm:' + elem.tag
return tostring(mathml)
When I give it the following input:
<math>
<a> 1 2 3 </a> <b />
<foo>Uitleg</foo>
<!-- <bar> -->
</math>
It results in:
<m:math>
<m:a> 1 2 3 </m:a> <m:b />
<m:foo>Uitleg</m:foo>
</m:math>
How come? And how can I preserve the comment?
Edit: I don't care which XML library is used, as long as I can make the tag change shown above. Unfortunately, lxml does not seem to allow this (and I cannot use proper namespace operations).
There are two ways to parse XML with the ElementTree module: the parse() function and the fromstring() function. parse() reads an XML document from a file (a filename or a file object), whereas fromstring() parses XML that is already held in a string.
ElementTree is part of the Python standard library and lets you parse and navigate an XML document. It represents the document as a tree of elements, which is easy to work with.
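As a small illustration of the two entry points just described, here is a minimal sketch. The sample document and the variable names are made up for the example, and an in-memory file object stands in for a real file on disk.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
xml_text = "<math><a> 1 2 3 </a><b /></math>"

# fromstring() takes the XML as a string and returns the root element.
root_from_string = ET.fromstring(xml_text)

# parse() takes a filename or a file object and returns an ElementTree;
# io.StringIO is used here in place of a file on disk.
tree = ET.parse(io.StringIO(xml_text))
root_from_file = tree.getroot()

print(root_from_string.tag, root_from_file.tag)
```

Both calls end up with the same root element; the only difference is where the XML text comes from.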
You cannot do this with the default xml.etree parser, because it discards comments (which is acceptable behaviour for an XML parser, by the way). But you can if you use the (compatible) lxml library, which allows you to configure parser options.
from lxml import etree
parser = etree.XMLParser(remove_comments=False)
tree = etree.parse('input.xml', parser=parser)
# or alternatively set the parser as default:
# etree.set_default_parser(parser)
This would by far be the easiest option. If you really have to use xml.etree, you could try hooking up your own parser, although even then comments are not officially supported: have a look at this example (from the author of xml.etree), which still seems to work in Python 2.7, by the way.
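For what it's worth, newer Python versions (3.8 and later, as far as I know) added an insert_comments option to xml.etree's TreeBuilder, so a custom parser may no longer be needed there. The following is only a sketch under that assumption; the function name prefix_tags and the prefix parameter are mine, not part of the question. Note that comment nodes do not have a string tag, so the loop has to skip them.

```python
import xml.etree.ElementTree as ET


def prefix_tags(xml_text, prefix="m:"):
    """Prefix every element tag while keeping comments (assumes Python 3.8+)."""
    # insert_comments=True asks the tree builder to keep comment nodes.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for elem in root.iter():
        # Comment nodes carry the ET.Comment function as their tag,
        # so only ordinary elements (string tags) are renamed.
        if isinstance(elem.tag, str):
            elem.tag = prefix + elem.tag
    return ET.tostring(root, encoding="unicode")


sample = """<math>
<a> 1 2 3 </a> <b />
<foo>Uitleg</foo>
<!-- <bar> -->
</math>"""

result = prefix_tags(sample)
print(result)
```

On an older interpreter this will fail at the TreeBuilder call, in which case the lxml route above is probably still the simpler choice.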