
Unable to read table from website using BeautifulSoup

I am trying to read a website's content using the code below.

import requests
from bs4 import BeautifulSoup

url  = "https://chartink.com/screener/test-121377" 
r    = requests.get(url)
data = r.text
soup = BeautifulSoup(data,"html.parser")

print(soup)

In the result, I cannot find the table that I see when I use "Inspect" on the element in the browser.
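One way to confirm that the rows are really missing from the downloaded HTML, rather than just hard to spot in the printout, is to count the `<tr>` tags in it. The following is a minimal sketch that uses only the standard library's `html.parser` module (the same parser that the "html.parser" argument selects in BeautifulSoup). The names `RowCounter` and `count_rows` and both sample strings are hypothetical and are not taken from the real page.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Counts every <tr> start tag found in the parsed markup."""

    def __init__(self):
        super().__init__()
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows += 1


def count_rows(html):
    """Return the number of <tr> tags in an HTML string."""
    counter = RowCounter()
    counter.feed(html)
    counter.close()
    return counter.rows


# Hypothetical samples: an empty table shell as a server might send it,
# and the same table after a script has added two rows in the browser.
static_html = (
    '<table id="DataTables_Table_0"><tbody></tbody></table>'
    '<script>/* fills the table later */</script>'
)
rendered_html = (
    '<table id="DataTables_Table_0"><tbody>'
    '<tr><td>1</td></tr><tr><td>2</td></tr>'
    '</tbody></table>'
)

print(count_rows(static_html))
print(count_rows(rendered_html))
```

Calling `count_rows(r.text)` on the real response would show whether the server sends any rows at all. A count of zero suggests the rows are added later by a script.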


Using Selenium could be one solution, but I am looking for an alternative, if possible.

Any idea how to read the data from the underlying script in the HTML?

leaf asked Mar 18 '18 16:03




1 Answer

Since the table is generated dynamically, try the requests_html library, which can render such content before you query it. Your script could look like this:

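import requests_html

session = requests_html.HTMLSession()
r = session.get('https://chartink.com/screener/test-121377')
# Run the page's JavaScript in headless Chromium (downloaded on first use),
# then wait 5 seconds so the table has time to fill in.
r.html.render(sleep=5)
# first=True returns a single element instead of a list.
table = r.html.find("table#DataTables_Table_0", first=True)
for row in table.find("tr"):
    # Collect header (th) and data (td) cells of each row as plain text.
    data = [cell.text for cell in row.find("th,td")]
    print(data)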

Output:

['Sr.', 'Stock Name', 'Symbol', 'Links', '% Chg', 'Price', 'Volume']
['1', 'Axis Bank Limited', 'AXISBANK', 'P&F | F.A', '-1.33%', '522.6', '12,146,623']
['2', 'Reliance Industries Limited', 'RELIANCE', 'P&F | F.A', '-1.29%', '900.05', '14,087,564']
['3', 'Tata Steel Limited', 'TATASTEEL', 'P&F | F.A', '-1.89%', '600.2', '11,739,582']
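
Each printed row is a plain list of strings, so looking up a value means remembering its column position. As a minimal sketch, the rows can be paired with the header row to get one dictionary per stock. The helper names `rows_to_records` and `to_number` are hypothetical, and the sample data is copied from the first two rows of the output above.

```python
def rows_to_records(rows):
    """Pair every data row with the header row (the first row)."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def to_number(text):
    """Convert strings such as '12,146,623' or '522.6' to a float."""
    return float(text.replace(",", ""))


# First two rows of the output shown above.
rows = [
    ['Sr.', 'Stock Name', 'Symbol', 'Links', '% Chg', 'Price', 'Volume'],
    ['1', 'Axis Bank Limited', 'AXISBANK', 'P&F | F.A', '-1.33%', '522.6', '12,146,623'],
]

records = rows_to_records(rows)
first = records[0]
print(first['Symbol'], to_number(first['Price']), to_number(first['Volume']))
```

In the script above, the same helper could be fed a list built by appending `data` on each loop iteration instead of printing it.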
SIM answered Sep 22 '22 13:09