How do you get all the rows from a particular table using BeautifulSoup?

I am learning Python and BeautifulSoup to scrape data from the web, and I want to read an HTML table. I can read the page into Open Office, which says that the table is Table #11.

It seems like BeautifulSoup is the preferred choice, but can anyone tell me how to grab a particular table and all the rows? I have looked at the module documentation, but can't get my head around it. Many of the examples that I have found online appear to do more than I need.

215

asked Jan 06 '10 01:01

Btibert3

1 Answer

This should be pretty straightforward if you have a chunk of HTML to parse with BeautifulSoup. The general idea is to navigate to your table using the findChildren method, then read the text value inside each cell with the string property.

>>> from BeautifulSoup import BeautifulSoup
>>>
>>> html = """
... <html>
... <body>
...     <table>
...         <th><td>column 1</td><td>column 2</td></th>
...         <tr><td>value 1</td><td>value 2</td></tr>
...     </table>
... </body>
... </html>
... """
>>>
>>> soup = BeautifulSoup(html)
>>> tables = soup.findChildren('table')
>>>
>>> # This will get the first (and only) table. Your page may have more.
>>> my_table = tables[0]
>>>
>>> # You can find children with multiple tags by passing a list of strings
>>> rows = my_table.findChildren(['th', 'tr'])
>>>
>>> for row in rows:
...     cells = row.findChildren('td')
...     for cell in cells:
...         value = cell.string
...         print("The value in this cell is %s" % value)
...
The value in this cell is column 1
The value in this cell is column 2
The value in this cell is value 1
The value in this cell is value 2
>>>
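The session above uses the older BeautifulSoup 3 import. As a rough sketch of the same idea with the current bs4 package, the snippet below picks one table by its position on the page and collects the text of every row. The helper name table_rows and the sample HTML are made up for illustration, and the built-in "html.parser" backend is just one possible parser choice. If the table number reported by Open Office counts from 1 in document order (an assumption that is worth checking against your page), Table #11 would be index 10.

```python
from bs4 import BeautifulSoup

# Sample page with two tables; the second one is the table we want.
html = """
<html><body>
<table><tr><td>other</td></tr></table>
<table>
  <tr><th>column 1</th><th>column 2</th></tr>
  <tr><td>value 1</td><td>value 2</td></tr>
</table>
</body></html>
"""


def table_rows(html_text, table_index):
    """Return the cell text of each row in the table at table_index (0-based).

    Hypothetical helper for illustration, not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html_text, "html.parser")
    # find_all returns every <table> in document order.
    tables = soup.find_all("table")
    table = tables[table_index]  # raises IndexError if the page has fewer tables
    rows = []
    for tr in table.find_all("tr"):
        # Header cells (<th>) and data cells (<td>) are both collected.
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


rows = table_rows(html, 1)
for row in rows:
    print(row)
```

Note that find_all searches recursively, so if the chosen table contains nested tables, their rows are picked up as well. Passing recursive=False at the appropriate level is one way to limit that.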

137

answered Sep 19 '22 13:09

JJ Geewax
