
Python XML: write " instead of &quot;

I am using Python's xml.dom.minidom and everything works well, except that in text nodes it writes the &quot; escape sequence instead of a literal ". This of course makes sense when a quote appears inside a tag (for example in an attribute value), but it bugs me in the text. How do I change this?

foges asked Aug 11 '11 17:08



1 Answer

looking at the source (Python 3.2, if it matters), this is hardcoded in the module-level _write_data() function, which escapes &, <, " and >. you would need to modify the writexml() method of the Text class (minidom's text node), either by subclassing it or by simply editing it, so that it doesn't call that function but instead does something similar that escapes only & and < (plus >, to be safe).
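a minimal sketch of that override, assuming the writexml() signature found in current CPython versions of minidom. the class name QuoteFriendlyText is made up for illustration, and the escaping shown is one reasonable choice, not the library's own behaviour:

```python
import io
from xml.dom import minidom


class QuoteFriendlyText(minidom.Text):
    """Hypothetical Text subclass that leaves double quotes unescaped."""

    def writexml(self, writer, indent="", addindent="", newl=""):
        # Build the same string minidom's Text.writexml() would pass on.
        data = "%s%s%s" % (indent, self.data, newl)
        # Escape "&" first so the entities added afterwards are not re-escaped.
        data = data.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        writer.write(data)


# Serialize a single node on its own to see the effect.
node = QuoteFriendlyText()
node.data = 'say "hi" & <bye>'
buffer = io.StringIO()
node.writexml(buffer)
serialized = buffer.getvalue()
```

here serialized keeps the double quotes as they are, while & , < and > are still written as entities.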

if you create the subclass outside of the package (instead of copying and hacking the package to make your own custom xml.dom.minidom) then it looks like, with a little care, you could make things work. you would create your own subclass of Text, modified as above, and then, to add text to the DOM, add an instance of your new class (or replace existing text nodes with instances of that class). you would also need to set the node's ownerDocument attribute. perhaps the simplest approach is to subclass Document as well and override its createTextNode() method so that it returns your class.
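the Document approach could look roughly like this. it is a self-contained sketch, so it repeats the text-node override. QuoteFriendlyText and QuoteFriendlyDocument are hypothetical names, and the body of createTextNode() is modelled on what minidom's own method appears to do (create the node, set data, set ownerDocument):

```python
from xml.dom import minidom


class QuoteFriendlyText(minidom.Text):
    """Hypothetical text node that writes double quotes literally."""

    def writexml(self, writer, indent="", addindent="", newl=""):
        data = "%s%s%s" % (indent, self.data, newl)
        # "&" must be handled first; quotes are deliberately left alone.
        data = data.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        writer.write(data)


class QuoteFriendlyDocument(minidom.Document):
    """Hypothetical Document whose text nodes are QuoteFriendlyText."""

    def createTextNode(self, data):
        if not isinstance(data, str):
            raise TypeError("node contents must be a string")
        node = QuoteFriendlyText()
        node.data = data
        # Nodes created by hand need their owner set explicitly.
        node.ownerDocument = self
        return node


doc = QuoteFriendlyDocument()
root = doc.createElement("note")
root.setAttribute("title", 'a "quoted" title')
root.appendChild(doc.createTextNode('she said "hi" & left'))
doc.appendChild(root)

xml_text = root.toxml()
# Parsing the output back gives the original text, so nothing is lost.
round_trip = minidom.parseString(xml_text).documentElement.firstChild.data
```

attribute values still go through minidom's normal escaping, so quotes there remain &quot;. note that documents built by minidom.parseString() contain plain Text nodes, so for parsed input you would still have to replace those nodes yourself.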

but i don't see a simpler way of doing what you want. it might be best to use a better dom implementation.

ps i have no idea whether this behaviour is required by the xml spec, or not. update: a quick scan of http://www.w3.org/TR/2008/REC-xml-20081126/#syntax suggests that only < and & must be encoded.

andrew cooke answered Sep 22 '22 21:09
