lxml

Uploaded: 2022-09-16 08:27:29

Question

I wrote a tiny html-parser in Python using lxml. It's very useful, but I have a problem.

I have the following code:

# doc is the tree already parsed with lxml
tags = doc.xpath('//table//tr/td[@align="right"]/b')
for x in tags:
    print(x.text.strip())

It works fine. But if there is a <br> tag inside a <b> element, like this:

<b> first-half <br>
    second-half </b>

this code prints only first-half, the text that comes before the <br> tag.

How can I get all of the text in <b> even if there is a <br> tag?

Thanks.

Anorov · Accepted Answer

Use text_content() to extract all of the non-markup text within a tag. Replace x.text with x.text_content().
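
A minimal sketch of the difference, assuming the page is parsed with lxml.html (text_content() is a method of lxml.html elements). The sample markup, the helper name cell_texts, and the whitespace normalisation are illustrative assumptions, not part of the original question:

```python
from lxml import html

# Hypothetical sample markup that mirrors the structure in the question.
snippet = (
    '<html><body><table><tr><td align="right">'
    '<b> first-half <br>\n    second-half </b>'
    '</td></tr></table></body></html>'
)
doc = html.fromstring(snippet)


def cell_texts(doc):
    """Return the full text of each matching <b>, including text after <br>."""
    results = []
    for x in doc.xpath('//table//tr/td[@align="right"]/b'):
        # x.text holds only the text before the first child element (<br>).
        # text_content() also includes the text that follows child elements.
        full_text = x.text_content()
        # Collapse runs of whitespace and newlines into single spaces.
        results.append(" ".join(full_text.split()))
    return results


print(cell_texts(doc))
```

The text after the <br> is stored on the <br> element's tail attribute rather than on the text of <b>, which is why x.text stops at first-half.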

lxml - ignore <br> tag in html

Tags:

python

html-parsing

shau-kote
