
BeautifulSoup - nextSibling

I'm trying to get the content "My home address" using the following code, but I get an AttributeError:

address = soup.find(text="Address:")
print address.nextSibling

This is my HTML:

<td><b>Address:</b></td>
<td>My home address</td>

What is a good way to navigate to the next td tag and pull its content?

ready asked May 14 '11 04:05




3 Answers

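The problem is that you have found a NavigableString, not the <td>. Also, nextSibling returns the next NavigableString or Tag, whichever comes first, so even if you had the <td> it wouldn't work the way you expect: it could return the whitespace between the two cells rather than the second <td>.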

This is what you want:

address = soup.find(text="Address:")
b_tag = address.parent
td_tag = b_tag.parent
next_td_tag = td_tag.findNext('td')
print(next_td_tag.contents[0])

Or more concise:

print(soup.find(text="Address:").parent.parent.findNext('td').contents[0])

Actually you could just do

print(soup.find(text="Address:").findNext('td').contents[0])

This works because findNext follows next over and over again, and next yields the next element in the order it was parsed, until one matches.
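To make the behaviour described in this answer concrete, here is a minimal sketch using bs4 names (find_next, next_sibling, string=). It assumes bs4 is installed and uses the built-in "html.parser"; the wrapping <table><tr> and the variable names are only there for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# The two cells from the question, wrapped in a table row (assumed markup).
html = "<table><tr><td><b>Address:</b></td>\n<td>My home address</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# find(string=...) returns the NavigableString inside <b>, not a tag.
label = soup.find(string="Address:")

# The string is the only child of <b>, so it has no sibling at all.
no_sibling = label.next_sibling

# Even starting from the enclosing <td>, the next sibling is the
# whitespace between the two cells, not the second <td>.
label_td = label.find_parent("td")
whitespace = label_td.next_sibling

# find_next walks forward in document order until a <td> matches.
value = label.find_next("td").get_text(strip=True)
print(value)
```

get_text(strip=True) is used here instead of contents[0] so the result is still a plain string if the cell happens to contain nested tags.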

Henry answered Sep 20 '22 10:09


Try this if you use bs4:

print(soup.find(string="Address:").find_next('td').contents[0])

Vyachez answered Sep 22 '22 10:09


I don't know if this was possible in 2011, but in 2021 I'd recommend using find_next_sibling() like this:

address = soup.find(text="Address:")
b = address.parent
address_td = b.parent
target_td = address_td.find_next_sibling('td')

The accepted answer works in your case, but it would not work if you had something like:

<div>
  <div><b>Address:</b><div>THE PROBLEM</div></div>
  <div>target</div>
</div>

You'd end up with <div>THE PROBLEM</div> instead of <div>target</div>.
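The difference can be checked with a short sketch along these lines, assuming bs4 and the built-in "html.parser". It uses find_parent("div") as a shortcut for the two .parent steps above, which is an illustrative choice rather than part of the original snippet.

```python
from bs4 import BeautifulSoup

html = """
<div>
  <div><b>Address:</b><div>THE PROBLEM</div></div>
  <div>target</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
label = soup.find(string="Address:")

# find_next walks in document order, so it reaches the nested <div> first.
via_find_next = label.find_next("div").get_text()

# Climb to the enclosing <div>, then look only among its following siblings.
container = label.find_parent("div")
via_sibling = container.find_next_sibling("div").get_text()

print(via_find_next)
print(via_sibling)
```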

Stefan Falk answered Sep 19 '22 10:09