
How to convert &lt; into < in lxml, Python?

Tags: python, lxml

I have an XML file:

<body>
    <entry>
         I go to <hw>to</hw> to school.
    </entry>
</body>

For some reason, I escaped the tags before parsing the file with the lxml parser, changing <hw> to &lt;hw&gt; and </hw> to &lt;/hw&gt;:

<body>
    <entry>
         I go to &lt;hw&gt;to&lt;/hw&gt; to school.
    </entry>
</body>

But after modifying the parsed XML data, I want to get a real <hw> element back, not the escaped &lt;hw&gt; text. How can I do that?

user1610952 asked Oct 28 '25 04:10


1 Answer

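First, import an unescape function and get the element that holds the escaped markup (assuming e refers to lxml.etree and body is the parsed root element):

from lxml import etree as e
from xml.sax.saxutils import unescape

entry = body[0]  # the <entry> element containing the escaped <hw> tags

Then serialize the element, unescape the resulting string, re-parse it, and replace the original element with the new one:

body.replace(entry, e.fromstring(unescape(e.tostring(entry, encoding="unicode"))))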
Kabie answered Oct 29 '25 19:10


