I have some lxml
element:
>> lxml_element.text
'hello BREAK world'
I need to replace the word BREAK
with an HTML break tag—<br />
. I've tried to do simple text replacing:
lxml_element.text.replace('BREAK', '<br />')
but it inserts the tag with escaped symbols, like <br/>
. How do I solve this problem?
Here's how you could do it. Setting up a sample lxml from your question:
>>> import lxml
>>> some_data = "<b>hello BREAK world</b>"
>>> root = lxml.etree.fromstring(some_data)
>>> root
<Element b at 0x3f35a50>
>>> root.text
'hello BREAK world'
Next, create a subelement tag <br>:
>>> childbr = lxml.etree.SubElement(root, "br")
>>> childbr
<Element br at 0x3f35b40>
>>> lxml.etree.tostring(root)
'<b>hello BREAK world<br/></b>'
But that's not all you want. You have to take the text before the <br> and place it in .text
:
>>> root.text = "hello"
>>> lxml.etree.tostring(root)
'<b>hello<br/></b>'
Then set the .tail
of the child to contain the rest of the text:
>>> childbr.tail = "world"
>>> lxml.etree.tostring(root)
'<b>hello<br/>world</b>'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With