
How to prevent xml.etree.ElementTree fromstring from dropping comment nodes

I have the following code fragment:

    from xml.etree.ElementTree import fromstring, tostring
    mathml = fromstring(input)
    # iter() replaces getiterator(), which was removed in Python 3.9
    for elem in mathml.iter():
        elem.tag = 'm:' + elem.tag
    return tostring(mathml)

When I pass in the following input:

    <math>
      <a> 1 2 3 </a>  <b />
    <foo>Uitleg</foo>
    <!-- <bar> -->
    </math>

It results in:

    <m:math>
      <m:a> 1 2 3 </m:a>  <m:b />
    <m:foo>Uitleg</m:foo>

    </m:math>

How come? And how can I preserve the comment?

Edit: I don't care which XML library is used, but I must be able to apply the tag change shown above. Unfortunately, lxml does not seem to allow this (and I cannot use proper namespace operations).

markijbema asked Mar 23 '11 17:03




1 Answer

You cannot do this with xml.etree's default parser, because it discards comments (which is acceptable behaviour for an XML parser, by the way). You can, however, if you use the (API-compatible) lxml library, which lets you configure parser options:

    from lxml import etree

    parser = etree.XMLParser(remove_comments=False)
    tree = etree.parse('input.xml', parser=parser)
    # or alternatively set the parser as default:
    # etree.set_default_parser(parser)

This is by far the easiest option. If you really have to use xml.etree, you could try hooking up your own parser, although even then comments are not officially supported; have a look at this example from the author of xml.etree (it still seems to work in Python 2.7, by the way).
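For readers on a newer Python: since Python 3.8 the standard library's `TreeBuilder` accepts `insert_comments=True`, so the "own parser" route can be sketched without lxml. The following is only a minimal sketch under that assumption, not the example linked above; the function name `prefix_tags` and the `SAMPLE` string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, mirroring the document from the question.
SAMPLE = """<math>
  <a> 1 2 3 </a>  <b />
<foo>Uitleg</foo>
<!-- <bar> -->
</math>"""


def prefix_tags(xml_text, prefix="m:"):
    """Prefix every element tag while keeping comment nodes (Python 3.8+)."""
    # insert_comments=True makes the tree builder keep comments as nodes.
    builder = ET.TreeBuilder(insert_comments=True)
    parser = ET.XMLParser(target=builder)
    root = ET.fromstring(xml_text, parser=parser)

    for elem in root.iter():
        # Comment nodes use the ET.Comment factory as their tag, not a
        # string, so they are skipped here and serialized unchanged.
        if isinstance(elem.tag, str):
            elem.tag = prefix + elem.tag

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(prefix_tags(SAMPLE))
```

Only comments inside the root element are kept this way. The `isinstance` check matters: without it, the loop would try to concatenate a string with the comment factory and fail.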

Steven answered Sep 18 '22 15:09