
Using Python Requests with JavaScript pages

I am trying to use the Requests library for Python (http://docs.python-requests.org/en/latest/), but the page I am trying to fetch uses JavaScript to load the information I want.

I have searched the web for a solution, but because my search includes the keyword "javascript", most of the results are about how to scrape using the JavaScript language.

Is there any way to use Requests with pages that rely on JavaScript?

asked Oct 15 '14 22:10 by biw




1 Answer

Good news: there is now a module built on Requests that supports JavaScript, requests-html: https://pypi.org/project/requests-html/

from requests_html import HTMLSession

session = HTMLSession()
r = session.get('http://www.yourjspage.com')
r.html.render()  # this call executes the js in the page

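As a bonus, it supports CSS selectors (through PyQuery rather than BeautifulSoup, as far as I can tell), so you can do things like

r.html.find('#myElementID', first=True).text

which returns the text content of the HTML element as you'd expect. Without first=True, find() returns a list of matching elements rather than a single element.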
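
For what it's worth, here is a minimal sketch that pulls the steps above into one helper. The function name fetch_rendered_text, its session and timeout parameters, and the example URL are my own illustrative choices, not part of requests-html; only HTMLSession, render() and find() come from the library. I have not tuned the timeout, so treat the value as a guess.

```python
def fetch_rendered_text(url, selector, session=None, timeout=20):
    """Return the text of the first element matching `selector` after the
    page's JavaScript has run, or None if nothing matches.

    `session` is a hypothetical hook so a stand-in object can be passed in;
    by default a real requests_html.HTMLSession is created.
    """
    if session is None:
        # Imported lazily so the helper can be exercised without the library.
        from requests_html import HTMLSession
        session = HTMLSession()

    r = session.get(url)
    r.raise_for_status()            # stop early on 4xx/5xx responses

    # Executes the page's JavaScript in a headless Chromium instance.
    r.html.render(timeout=timeout)

    # first=True yields a single element (or None) instead of a list.
    element = r.html.find(selector, first=True)
    return element.text if element is not None else None


# Example call (placeholder URL and selector taken from the answer above):
# text = fetch_rendered_text('http://www.yourjspage.com', '#myElementID')
```

Note that the first call to render() downloads Chromium, so it can take a while. Returning None for a missing element avoids an AttributeError when the selector does not match anything.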

answered Oct 15 '22 13:10 by marvb