
Beautiful Soup find children for particular div

I am trying to parse a webpage that looks like this with Python and Beautiful Soup: [image]

I am trying to extract the contents of the highlighted td. Currently I can get all the tds with:

alltd = soup.findAll('td')
for td in alltd:
    print(td)

But I am trying to narrow the search to the tds inside the class "tablebox". That will probably still return 30+ results, but that is a more manageable number than 300+.

How can I extract the contents of the highlighted td in picture above?

Nick asked Nov 02 '12 19:11




1 Answer

It is useful to know that the elements Beautiful Soup finds inside another element are the same type as that parent element (Tag objects), so the same search methods, such as find_all, can be called on them in turn.
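As a small illustration of that point, here is a sketch on a made-up one-cell snippet (the markup is hypothetical, not the asker's page). It assumes the bs4 package and the built-in "html.parser" backend.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical snippet, not the asker's page.
soup = BeautifulSoup(
    '<div class="tablebox"><table><tr><td>1</td></tr></table></div>',
    "html.parser",
)

div = soup.find("div", class_="tablebox")  # a Tag
td = div.find("td")                        # searching from a Tag gives another Tag

print(isinstance(div, Tag), isinstance(td, Tag))  # True True
```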

So this is somewhat working code for your example:

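from bs4 import BeautifulSoup

# html is assumed to hold the page source as a string
soup = BeautifulSoup(html, "html.parser")
div_tags = soup.find_all("div", {"class": "tablebox"})

for div in div_tags:
    # search only inside this div, not the whole document
    td_tags = div.find_all("td", {"class": "align-right"})
    for td in td_tags:
        print(td.text)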

This prints the text of every td with the class "align-right" that is nested anywhere inside a div with the class "tablebox".
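To try the approach without the original page, here is a self-contained sketch. The HTML string, the function names and the cell values are all made up for illustration; only the class names "tablebox" and "align-right" come from the answer. The second function shows what is probably the shortest equivalent, a single CSS selector passed to select().

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
HTML = """
<table><tr><td class="align-right">outside</td></tr></table>
<div class="tablebox">
  <table>
    <tr><td>Name</td><td class="align-right">10</td></tr>
    <tr><td>Other</td><td class="align-right">20</td></tr>
  </table>
</div>
"""


def right_aligned_cells(html):
    """Return the text of every td.align-right nested inside a div.tablebox."""
    soup = BeautifulSoup(html, "html.parser")
    cells = []
    for div in soup.find_all("div", class_="tablebox"):
        for td in div.find_all("td", class_="align-right"):
            cells.append(td.get_text(strip=True))
    return cells


def right_aligned_cells_css(html):
    """Same lookup written as one CSS descendant selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True)
            for td in soup.select("div.tablebox td.align-right")]


print(right_aligned_cells(HTML))      # ['10', '20']
print(right_aligned_cells_css(HTML))  # ['10', '20']
```

The td holding "outside" is skipped in both versions because it is not inside a tablebox div. If tablebox divs can be nested inside each other, the nested-loop version may report a cell more than once, so the selector form is likely the safer choice there.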

Bo Milanovich answered Sep 25 '22 15:09
