I am using Python's xml.dom.minidom and everything works well, except that in text nodes it writes out &quot; escape sequences instead of a plain " character. This of course makes sense if a quote appears inside a tag (for example, in an attribute value), but it bugs me in the text. How do I change this?
Looking at the source (Python 3.2, if it matters), this is hardcoded in the module-level _write_data() function. You would need to modify the writexml() method of the text node class (Text in xml.dom.minidom), either by subclassing it or by simply editing it, so that it doesn't call that function and instead does something similar that escapes only &, < and >.
If you create the subclass outside of the package (instead of copying and hacking the package to make your own custom minidom), then it looks like, with a little care, you could make things work. You would create your own subclass of Text, modified as above, and then, to add text to the DOM, you would add an instance of your new class (or replace existing text nodes with instances of that class). You would need to set the ownerDocument attribute on each such node. Perhaps the simplest approach would be to also subclass Document and override its createTextNode() method.
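The approach described above can be sketched roughly as follows. This is only a minimal sketch, not a tested drop-in fix: the class names PlainQuoteText and PlainQuoteDocument are made up for illustration, and the code assumes that Text.writexml() takes the (writer, indent, addindent, newl) arguments it has in current CPython and that createTextNode() can be overridden the same way the stock one is written.

```python
from xml.dom import minidom


class PlainQuoteText(minidom.Text):
    """Hypothetical Text subclass that leaves double quotes unescaped."""

    def writexml(self, writer, indent="", addindent="", newl=""):
        data = "%s%s%s" % (indent, self.data, newl)
        # '&' is replaced first so the entities added next are not re-escaped.
        data = data.replace("&", "&amp;")
        data = data.replace("<", "&lt;").replace(">", "&gt;")
        writer.write(data)


class PlainQuoteDocument(minidom.Document):
    """Hypothetical Document subclass that hands out PlainQuoteText nodes."""

    def createTextNode(self, data):
        if not isinstance(data, str):
            raise TypeError("node contents must be a string")
        node = PlainQuoteText()
        node.data = data
        node.ownerDocument = self  # the stock createTextNode() sets this too
        return node


# Same content serialized with the custom classes and with stock minidom.
sample = 'say "hi" & <bye>'

doc = PlainQuoteDocument()
root = doc.createElement("note")
doc.appendChild(root)
root.appendChild(doc.createTextNode(sample))
custom_xml = root.toxml()

stock_doc = minidom.Document()
stock_root = stock_doc.createElement("note")
stock_doc.appendChild(stock_root)
stock_root.appendChild(stock_doc.createTextNode(sample))
stock_xml = stock_root.toxml()

print(custom_xml)
print(stock_xml)
```

Note that this only affects text nodes created through createTextNode() on the custom document. Nodes produced by minidom.parseString() would still be ordinary Text instances and would have to be replaced by hand.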
But I don't see a simpler way of doing what you want. It might be best to use a better DOM implementation.
PS: I have no idea whether this behaviour is required by the XML spec or not. Update: a quick scan of http://www.w3.org/TR/2008/REC-xml-20081126/#syntax suggests that only < and & must be encoded in text content.
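As a rough sanity check of that reading of the spec, the snippet below escapes a string with xml.sax.saxutils.escape(), which handles &, < and > but leaves quotes alone, and then parses the result back with minidom. The sample string and variable names are just illustrative.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape

text = 'say "hi" & <bye>'

# escape() converts &, < and > only; the double quotes stay as they are.
escaped = escape(text)

# If a bare quote were illegal in text content, parsing would fail here.
doc = minidom.parseString("<note>%s</note>" % escaped)
round_tripped = doc.documentElement.firstChild.data

print(escaped)
print(round_tripped == text)
```

So a parser accepts unescaped quotes in text content, and the &quot; that minidom writes there looks like a conservative choice rather than something the format demands.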