
Saving full page content using Selenium

Tags:

selenium

I was wondering what's the best way to save all the files that are retrieved when Selenium visits a site. In other words, when Selenium visits http://www.google.com I want to save the HTML, JavaScript (including scripts referenced in src tags), images, and potentially content contained in iframes. How can this be done?

I know getHTMLSource() will return the HTML content of the main frame, but how can this be extended to download the complete set of files necessary to render that page again? Thanks in advance!

Rick asked Jun 15 '10 22:06

1 Answer

The only built-in method Selenium has for retrieving source content is page_source:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.someurl.com')
page_source = driver.page_source
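If only the main frame's HTML is needed, that string can simply be written to disk. A minimal sketch (the helper name and filename are arbitrary, not part of Selenium's API):

```python
# Minimal sketch: persist the HTML returned by driver.page_source.
# The helper takes the source as a plain string so it works with any driver.
def save_html(source, path="page.html"):
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)

# Usage with a live driver (assumed already created):
# save_html(driver.page_source)
```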

But that doesn't download all the images, CSS, and JS scripts like you would get if you used Ctrl+S on a webpage. So you'll need to emulate the Ctrl+S keystroke after you navigate to the page, as Algorithmatic has stated.

I made a gist to show how that's done: https://gist.github.com/GrilledChickenThighs/211c307edf8f828806c4bb4e4707b106

# Download entire webpage including all JavaScript, HTML, and CSS.
# Replicates pressing Ctrl+S while on a webpage.

from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

def save_current_page(browser):
    # Send Ctrl+S to the browser window to trigger the "Save Page As" dialog.
    ActionChains(browser).send_keys(Keys.CONTROL, "s").perform()
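Note that the "Save Page As" dialog this opens is a native OS window, which Selenium cannot drive. An alternative sketch that stays inside Python is to parse page_source for asset references (img/script src, link href) and fetch each one yourself with the standard library; the class and function names below are illustrative, not part of Selenium's API:

```python
# Sketch: collect asset URLs from an HTML string such as driver.page_source,
# resolving relative paths against the page URL, then download each asset.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlretrieve

class AssetCollector(HTMLParser):
    """Gather src/href attributes from tags that reference external files."""
    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script", "iframe") and attrs.get("src"):
            self.assets.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.assets.append(attrs["href"])

def collect_assets(html, base_url):
    # Return absolute URLs for every referenced asset.
    parser = AssetCollector()
    parser.feed(html)
    return [urljoin(base_url, a) for a in parser.assets]

# With a live driver, downloading would then look like:
# for url in collect_assets(driver.page_source, driver.current_url):
#     urlretrieve(url, url.rsplit("/", 1)[-1])
```

This only covers assets referenced directly in the HTML; files pulled in by scripts at runtime, or content inside iframes, would need a further pass (e.g. switching to each frame and repeating).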
pAulseperformance answered Nov 01 '22 13:11