
Using Selenium with Python and PhantomJS to download file to filesystem


I've been grappling with using PhantomJS/Selenium/python-selenium to download a file to the filesystem. I can easily navigate the DOM, click, hover, etc., but downloading a file is proving quite troublesome. I also tried a headless approach with Firefox and pyvirtualdisplay, but that didn't work well either and was unbelievably slow. I know that CasperJS allows file downloads. Does anyone know how to integrate CasperJS with Python, or how to use PhantomJS to download files? Much appreciated.

Encinoman818 asked Sep 10 '14 00:09




1 Answer

Although this question is quite old, downloading files through PhantomJS is still a problem. We can, however, use PhantomJS to find the download link and collect the needed cookies (CSRF tokens and so on), and then use requests to actually download the file:

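    import requests
    from selenium import webdriver

    driver = webdriver.PhantomJS()
    driver.get('page_with_download_link')

    # find_element_by_id returns a WebElement, not a URL; read its href
    # (this assumes the element is a link with an href attribute)
    download_link = driver.find_element_by_id('download_link').get_attribute('href')

    # Copy the browser's cookies into a requests session
    session = requests.Session()
    for cookie in driver.get_cookies():
        session.cookies.set(cookie['name'], cookie['value'])

    response = session.get(download_link)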

The actual file content should now be in response.content. We can then write it to disk with open, or do whatever else we want with it.
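
As a minimal sketch of that last step, the helper below writes a response body to disk in chunks. The name `save_response` and its parameters are hypothetical, not part of requests or Selenium; it only assumes an object with an `iter_content(chunk_size=...)` method, which requests responses provide.

```python
from pathlib import Path


def save_response(response, path, chunk_size=8192):
    """Write a response body to `path` in chunks and return the byte count.

    `response` is assumed to expose iter_content(chunk_size=...), as
    requests.Response does. The function name is illustrative only.
    """
    target = Path(path)
    written = 0
    with target.open('wb') as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                fh.write(chunk)
                written += len(chunk)
    return written
```

With the session from the answer, this could be called as `save_response(session.get(url, stream=True), 'downloaded_file')`, where `url` is the download URL as a string and the file name is a placeholder. Passing `stream=True` lets requests fetch the body incrementally instead of holding it all in memory. For small files, writing `response.content` directly to a file opened in `'wb'` mode should work just as well.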

valignatev answered Oct 24 '22 01:10