There's an XML file:
<body>
<entry>
I go to <hw>to</hw> to school.
</entry>
</body>
For some reason, I changed <hw> to &lt;hw&gt; and </hw> to &lt;/hw&gt; before parsing it with the lxml parser.
<body>
<entry>
I go to &lt;hw&gt;to&lt;/hw&gt; to school.
</entry>
</body>
But after modifying the parsed XML data, I want to get a real <hw> element, not the escaped text &lt;hw&gt;. How can I do that?
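First, import an unescape function and get the entry element:
from lxml import etree as e  # assumption: e in this answer refers to lxml.etree
from xml.sax.saxutils import unescape
entry = body[0]
Then serialize the entry, unescape the result, reparse it, and replace the original element with the new one:
body.replace(entry, e.fromstring(unescape(e.tounicode(entry))))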