BeautifulSoup results to pandas DataFrame

The code below returns a table with the following results:

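import bs4
import requests

# url is the page being scraped (not shown in the question)
r = requests.get(url)
soup = bs4.BeautifulSoup(r.text, 'lxml')

# Find the first element whose class is 'table_grey_border'
mylist = soup.find(attrs={'class': 'table_grey_border'})
print(mylist)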

Results (the table stretches on for 1700 rows):

<table cellpadding="0" cellspacing="2" class="table_grey_border" width="100%">
<tr valign="top">
<td class="verd_black12" width="18%"><b>STOCK CODE</b></td>
<td class="verd_black12" width="42%"><b>NAME OF LISTED SECURITIES</b></td>
<td class="verd_black12" width="19%"><b>BOARD LOT</b></td>
<td class="verd_black12" colspan="4" width="12%"><b>REMARK</b></td>
</tr>
<tr class="tr_normal">
<td class="verd_black12" width="18%">00001</td>
<td class="verd_black12" width="42%"><a href="../../../invest/company/profile_page_e.asp?WidCoID=00001&amp;WidCoAbbName=&amp;Month=&amp;langcode=e" target="_parent">CKH HOLDINGS</a></td>
<td class="verd_black12" width="19%">500</td>
<td align="center" class="verd_black12" width="3%">#</td>
<td align="center" class="verd_black12" width="3%">H</td>
<td align="center" class="verd_black12" width="3%">O</td>
<td align="center" class="verd_black12" width="3%">F</td>
</tr>
<tr class="tr_normal">
<td class="verd_black12" width="18%">00002</td>
<td class="verd_black12" width="42%"><a href="../../../invest/company/profile_page_e.asp?WidCoID=00002&amp;WidCoAbbName=&amp;Month=&amp;langcode=e" target="_parent">CLP HOLDINGS</a></td>
<td class="verd_black12" width="19%">500</td>
<td align="center" class="verd_black12" width="3%">#</td>
<td align="center" class="verd_black12" width="3%">H</td>
<td align="center" class="verd_black12" width="3%">O</td>
<td align="center" class="verd_black12" width="3%">F</td>
</tr>
...

My question is: how do I put each of these rows into a pandas DataFrame? I tried the code below, but it returns an error.

a = pandas.read_html(mylist)
print(a)

Error:

TypeError: 'NoneType' object is not callable
jake wong asked Feb 05 '17 10:02

1 Answer

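According to the pandas documentation, read_html expects a URL, a file-like object, or an HTML string rather than a BeautifulSoup object. You can pass the URL and an attrs filter directly, and it returns a list of DataFrames: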

pandas.read_html(url, attrs={'class': 'table_grey_border'})
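
If you already have the BeautifulSoup table in hand and would rather not fetch the page again, one option is to build the DataFrame from the rows yourself. The following is only a minimal sketch under some assumptions: the inline HTML is a trimmed-down copy of the table from the question, the 'html.parser' backend is used so that lxml is not required, and the column names (stock_code, name, board_lot, remark_1 to remark_4) are made up for illustration.

```python
import bs4
import pandas as pd

# Trimmed-down sample of the table in the question (illustrative only).
html = """
<table class="table_grey_border">
<tr valign="top">
<td><b>STOCK CODE</b></td>
<td><b>NAME OF LISTED SECURITIES</b></td>
<td><b>BOARD LOT</b></td>
<td colspan="4"><b>REMARK</b></td>
</tr>
<tr class="tr_normal">
<td>00001</td>
<td><a href="#">CKH HOLDINGS</a></td>
<td>500</td>
<td>#</td><td>H</td><td>O</td><td>F</td>
</tr>
<tr class="tr_normal">
<td>00002</td>
<td><a href="#">CLP HOLDINGS</a></td>
<td>500</td>
<td>#</td><td>H</td><td>O</td><td>F</td>
</tr>
</table>
"""

soup = bs4.BeautifulSoup(html, "html.parser")
mylist = soup.find(attrs={"class": "table_grey_border"})

# Collect the text of every cell in each data row; the header row has no
# 'tr_normal' class, so it is skipped here.
rows = []
for tr in mylist.find_all("tr", class_="tr_normal"):
    rows.append([td.get_text(strip=True) for td in tr.find_all("td")])

# Hypothetical column names: the REMARK header spans four cells.
columns = ["stock_code", "name", "board_lot",
           "remark_1", "remark_2", "remark_3", "remark_4"]
df = pd.DataFrame(rows, columns=columns)
print(df)
```

Because the cells are kept as strings, stock codes such as 00001 keep their leading zeros. As an alternative, wrapping the tag as io.StringIO(str(mylist)) and passing that to pandas.read_html should also work, since read_html then receives file-like HTML instead of a BeautifulSoup object, but it may convert the stock codes to integers unless you tell it otherwise.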
宏杰李 answered Oct 10 '22 09:10