How to parse an HTML table in Python

Question

I'm new to parsing tables and regular expressions. Can you help me parse this in Python?

<table callspacing="0" cellpadding="0">
    <tbody><tr>
    <td>1text&nbsp;2text</td>
    <td>3text&nbsp;</td>
    </tr>
    <tr>
    <td>4text&nbsp;5text</td>
    <td>6text&nbsp;</td>
    </tr>
</tbody></table>

I need the "3text" and "6text" values.

Humayun Ahmad Rajib · Accepted Answer

You can use the CSS selector methods select() and select_one() to get "3text" and "6text", like this:

from bs4 import BeautifulSoup
html_doc='''
<table callspacing="0" cellpadding="0">
    <tbody><tr>
    <td>1text&nbsp;2text</td>
    <td>3text&nbsp;</td>
    </tr>
    <tr>
    <td>4text&nbsp;5text</td>
    <td>6text&nbsp;</td>
    </tr>
</tbody></table>
'''

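# The 'lxml' parser requires the lxml package; "html.parser" also works here
soup = BeautifulSoup(html_doc, 'lxml')
rows = soup.select('tr')

for row in rows:
    # td:nth-child(2) matches the second cell of each row
    print(row.select_one('td:nth-child(2)').text)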

You can also use the find_all() method:

trs = soup.find('table').find_all('tr')

for tr in trs:
    tds = tr.find_all('td')
    # Index 1 is the second cell of the row
    print(tds[1].text)

Result:

3text 
6text
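
The trailing space after "3text" in the output is the non-breaking space (&nbsp;) from the cell. If you want clean strings, one option is get_text(strip=True) instead of .text. Below is a minimal sketch of that idea, not a definitive implementation. The helper name second_column is made up for illustration, and it uses the built-in "html.parser" so that lxml isn't required:

```python
from bs4 import BeautifulSoup

html_doc = '''
<table callspacing="0" cellpadding="0">
    <tbody><tr>
    <td>1text&nbsp;2text</td>
    <td>3text&nbsp;</td>
    </tr>
    <tr>
    <td>4text&nbsp;5text</td>
    <td>6text&nbsp;</td>
    </tr>
</tbody></table>
'''


def second_column(html):
    """Return the stripped text of the second <td> in every row (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for row in soup.find_all("tr"):
        cells = row.find_all("td")
        # Skip rows with fewer than two cells instead of raising IndexError
        if len(cells) > 1:
            # strip=True removes surrounding whitespace, including the &nbsp; (\xa0)
            values.append(cells[1].get_text(strip=True))
    return values


print(second_column(html_doc))
```

Returning a list instead of printing inside the loop makes it easier to reuse the values later.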

Saba · Answer

The best way is to use BeautifulSoup:

from bs4 import BeautifulSoup

html_doc='''
<table callspacing="0" cellpadding="0">
    <tbody><tr>
    <td>1text&nbsp;2text</td>
    <td>3text&nbsp;</td>
    </tr>
    <tr>
    <td>4text&nbsp;5text</td>
    <td>6text&nbsp;</td>
    </tr>
</tbody></table>
'''


soup = BeautifulSoup(html_doc, "html.parser")

# find all tr tags
for i in soup.find_all("tr"):
    # find all td tags inside each tr tag
    for k in i.find_all("td"):
        # print the text of each td tag
        print(k.text)

In this case it prints:

1text 2text
3text 
4text 5text
6text

But you can grab just the text you want by indexing. In this case, you could simply use:

# find all tr tags
for i in soup.find_all("tr"):
    # print the text of the second td tag (index 1) in each tr tag
    print(i.find_all("td")[1].text)
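
If installing BeautifulSoup isn't an option, roughly the same result can be had with the standard library's html.parser.HTMLParser. This is a different technique from the one used in the answers above, and only a sketch. The class name TableCellParser and the variable names are made up for illustration:

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td>, grouped by <tr> (illustrative class)."""

    def __init__(self):
        # convert_charrefs defaults to True, so &nbsp; arrives as \xa0
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # Start a new row
            self.rows.append([])
        elif tag == "td" and self.rows:
            # Start a new, empty cell in the current row
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that appears inside a <td>
        if self._in_cell:
            self.rows[-1][-1] += data


html_doc = '''
<table callspacing="0" cellpadding="0">
    <tbody><tr>
    <td>1text&nbsp;2text</td>
    <td>3text&nbsp;</td>
    </tr>
    <tr>
    <td>4text&nbsp;5text</td>
    <td>6text&nbsp;</td>
    </tr>
</tbody></table>
'''

parser = TableCellParser()
parser.feed(html_doc)
parser.close()

# Take the second cell of each row; str.strip() also removes the \xa0
second = [row[1].strip() for row in parser.rows if len(row) > 1]
print(second)
```

This avoids a third-party dependency, but it does not handle nested tables or malformed markup as well as BeautifulSoup does.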

Tags: python, html-table, beautifulsoup

RoyalGoose
