Get text outside one tag and inside another

Question

I am parsing a web page with BeautifulSoup, and it has some elements like the following:

<td><font size="2" color="#00009c"><b>Consultant Registration Number  :</b></font>  16043646</td>

The structure always seems to be a <td> whose first part is wrapped in <font><b>, and the text after the </font> tag can be empty. How can I get the text that comes after the font tag?

In this example I would want to get "16043646". If the HTML were instead

<td><font size="2" color="#00009c"><b>Consultant Registration Number  :</b></font></td>

I would want to get "" (an empty string).

Shawn Chin · Accepted Answer

>>> from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 import (Python 2)
>>> text1 = '<td><font size="2" color="#00009c"><b>Consultant Registration Number  :</b></font>  16043646</td>'
>>> text2 = '<td><font size="2" color="#00009c"><b>Consultant Registration Number  :</b></font></td>'
>>> BeautifulSoup(text1).td.font.nextSibling
u'  16043646'
>>> BeautifulSoup(text2).td.font.nextSibling  # no sibling: returns None, so nothing is printed
>>>
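
The session above uses the older BeautifulSoup 3 import. As a minimal sketch of the same idea with the bs4 package, assuming it is installed and that every cell follows the <td><font>...</font>text</td> layout from the question, the sibling lookup can be wrapped so that a missing value comes back as "" rather than None. The helper name text_after_font is hypothetical and only used for illustration.

```python
from bs4 import BeautifulSoup, NavigableString


def text_after_font(html):
    """Return the stripped text that follows the first <font> in the first <td>.

    Hypothetical helper: assumes the <td><font>...</font>text</td> layout
    described in the question. Returns "" when nothing follows the tag.
    """
    soup = BeautifulSoup(html, "html.parser")
    # next_sibling is the bs4 spelling of BeautifulSoup 3's nextSibling.
    sibling = soup.td.font.next_sibling
    # The sibling is None when </font> is the last node in the cell, and it
    # could be another tag in other layouts; only plain text is returned.
    if isinstance(sibling, NavigableString):
        return sibling.strip()
    return ""


text1 = '<td><font size="2" color="#00009c"><b>Consultant Registration Number  :</b></font>  16043646</td>'
text2 = '<td><font size="2" color="#00009c"><b>Consultant Registration Number  :</b></font></td>'

print(repr(text_after_font(text1)))  # '16043646'
print(repr(text_after_font(text2)))  # ''
```

Stripping the result drops the two leading spaces seen in the raw sibling; omit the strip() call if the original whitespace matters.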

Tags: python, html-parsing, beautifulsoup

murgatroid99
