
Using Selenium in Python to save a webpage on Firefox

I am trying to use Selenium in Python to save webpages in Firefox on macOS.

So far, I have managed to press COMMAND + S to open the Save As window. However, I don't know how to:

  1. change the directory of the file,
  2. change the name of the file, and
  3. click the SAVE AS button.

Could someone help?

Below is the code I have used to press COMMAND + S:

ActionChains(browser).key_down(Keys.COMMAND).send_keys("s").key_up(Keys.COMMAND).perform()

The reason I am using this method is that I get a UnicodeEncodeError when I:

  1. write the page_source to an HTML file, and
  2. store scraped information in a CSV file.

Writing to an HTML file:

file_object = open(completeName, "w")
html = browser.page_source
file_object.write(html)
file_object.close() 

Writing to a CSV file:

csv_file_write.writerow(to_write)

Error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 1: ordinal not in range(128)

Tommy N asked Jun 15 '16 12:06




3 Answers

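import io
# io.open takes an explicit encoding on both Python 2 and 3, avoiding the ASCII default
with io.open('page.html', 'w', encoding='utf-8') as f:
    f.write(driver.page_source)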

misantroop answered Oct 03 '22 06:10


What you are trying to achieve is not possible with Selenium. The dialog that opens is a native operating-system window, not part of the web page, so Selenium cannot interact with it.

The closest thing you can do is collect page_source, which gives you the HTML of the current page, and save that to a file.

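import codecs
import os

# save_path and file_name are assumed to be defined earlier in the script
completeName = os.path.join(save_path, file_name)
# Open the file with an explicit UTF-8 encoding so non-ASCII characters can be written
file_object = codecs.open(completeName, "w", "utf-8")
html = browser.page_source
file_object.write(html)
file_object.close()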
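
The three steps from the question (choose a directory, choose a file name, save) can be handled without the dialog by building the path in code. Here is a minimal sketch, assuming the HTML is already available as a text string such as browser.page_source. The helper name save_html and its parameters are hypothetical and not part of Selenium:

```python
import io
import os


def save_html(html, directory, file_name):
    """Write html (a text string) to directory/file_name as UTF-8 and return the path."""
    # Create the target directory if it does not exist yet.
    if not os.path.isdir(directory):
        os.makedirs(directory)
    path = os.path.join(directory, file_name)
    # io.open accepts an explicit encoding on both Python 2 and Python 3,
    # so characters such as u'\xf8' are written as UTF-8 instead of ASCII.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return path
```

With a live driver this would be called as save_html(browser.page_source, save_path, file_name). It saves only the HTML, not the images or stylesheets that the browser's Save As would also download.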

If you really need to save the complete page through the browser, you should look into a GUI automation tool such as AutoIt, which can interact with the save dialog. Note that AutoIt runs on Windows only, so on macOS you would need an equivalent tool.

RemcoW answered Oct 03 '22 06:10


You cannot interact with system dialogs, such as the save-file dialog, through Selenium. If you want to save the page's HTML, you can do something like this:

import io

page = driver.page_source
# io.open takes an explicit encoding on both Python 2 and 3
file_ = io.open('page.html', 'w', encoding='utf-8')
file_.write(page)
file_.close()
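
The CSV write from the question fails for the same reason, and the same idea applies: open the file with an explicit encoding. Here is a sketch assuming Python 3, where the csv module works with text strings. The function name write_rows is made up for illustration:

```python
import csv


def write_rows(path, rows):
    """Write an iterable of rows to path as UTF-8 encoded CSV."""
    # newline="" lets the csv module control line endings itself,
    # as the csv documentation recommends.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)
```

On Python 2 the csv module does not handle unicode strings directly, so each value would need to be encoded to UTF-8 bytes before being passed to writerow.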

Mobrockers answered Oct 03 '22 05:10