Get page generated with Javascript in Python

I'd like to download a web page that is generated by JavaScript and store it in a string variable in Python. The page is generated when you click a button.

If I knew the resulting URL I would use urllib2, but that is not the case.

Thank you.

xralf asked Jan 22 '12 09:01



1 Answer

You could use Selenium Webdriver:

    #!/usr/bin/env python
    from contextlib import closing

    from selenium.webdriver import Firefox  # pip install selenium
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    url = 'http://example.com/'  # placeholder: the page that contains the button

    # use Firefox to get a page with JavaScript-generated content
    with closing(Firefox()) as browser:
        browser.get(url)
        # locate the button by its name attribute and click it
        button = browser.find_element(By.NAME, 'button')
        button.click()
        # wait for the page to load
        WebDriverWait(browser, timeout=10).until(
            lambda x: x.find_element(By.ID, 'someId_that_must_be_on_new_page'))
        # store it in a string variable
        page_source = browser.page_source
    print(page_source)
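
To see why a real browser is needed here, it may help to look at what a plain HTTP fetch returns. The sketch below is only an illustration under stated assumptions: the tiny page, the `Handler` class and the `fetch_raw_html` helper are all made up for the demo, and it uses Python 3's `urllib.request` (the successor to urllib2). It serves a page locally and fetches it without a browser, so the script arrives as text and never runs.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page: the <p> is only filled in when a browser runs the script.
PAGE = b"""<html><body>
<p id="out"></p>
<script>document.getElementById('out').textContent = 'generated by JS';</script>
</body></html>"""


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_raw_html():
    """Serve PAGE on a free local port and fetch it with urllib."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        # Empty ProxyHandler: ignore any proxy settings for this local request.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


html = fetch_raw_html()
print('<p id="out"></p>' in html)  # True: the paragraph is still empty
print("<script>" in html)          # True: the script arrives as unexecuted text
```

The fetched string contains the empty paragraph and the script source, not the text the script would have written. Selenium drives an actual browser, which is why `page_source` there reflects the page after the click.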
jfs answered Oct 08 '22 14:10