
Parse/extract table data using Python

Tags:

python

html

<html> 
<table border="1px"> 
<tr>
<td>yes</td>
<td>no</td>
</tr>
</table>
</html>

Is there any way to get the contents of the table (yes, no) without using BeautifulSoup?

I'm a Python beginner, so any help or pointers would be greatly appreciated.

Thank you.

Asked Jul 14 '11 at 08:07 by Php Beginner


1 Answer

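You can use the HTMLParser module that comes with the Python standard library. The session below is Python 2; in Python 3 the module is named html.parser and print is a function.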

>>> import HTMLParser
>>> data = '''
... <html> 
... <table border="1px"> 
... <tr>
... <td>yes</td>
... <td>no</td>
... </tr>
... </table>
... </html>
... '''
>>> class TableParser(HTMLParser.HTMLParser):
...     def __init__(self):
...         HTMLParser.HTMLParser.__init__(self)
...         self.in_td = False
...     
...     def handle_starttag(self, tag, attrs):
...         if tag == 'td':
...             self.in_td = True
...     
...     def handle_data(self, data):
...         if self.in_td:
...             print data
...     
...     def handle_endtag(self, tag):
...         if tag == 'td':
...             self.in_td = False
... 
>>> p = TableParser()
>>> p.feed(data)
yes
no
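
The session above prints each cell as it is encountered. If you would rather have the values in a data structure, here is a minimal Python 3 sketch of the same approach that collects the cells into one list per table row. The names TableRowParser and rows are made up for this illustration, and the sketch assumes simple, non-nested tables like the one in the question.

```python
from html.parser import HTMLParser  # Python 3 name of the HTMLParser module


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text pieces of the cell being read, or None outside <td>

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag == 'td':
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._cell is not None:
            # Cells that appear outside any <tr> are ignored in this sketch.
            if self._row is not None:
                self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None


data = '''
<html>
<table border="1px">
<tr>
<td>yes</td>
<td>no</td>
</tr>
</table>
</html>
'''

parser = TableRowParser()
parser.feed(data)
parser.close()
print(parser.rows)  # [['yes', 'no']]
```

Checking the tag name in handle_endtag means inline tags inside a cell (for example <b>) do not end the cell early.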
Answered Oct 17 '22 at 00:10 by Vasiliy Faronov