How to download a file using the remote selenium webdriver?

Tags:

selenium-webdriver

I am using a remote selenium webdriver to perform some tests. At some point, however, I need to download a file and check its contents.

I am using the remote webdriver as follows (in python):

PROXY = ...

prefs = {
    "profile.default_content_settings.popups":0,
    "download.prompt_for_download": "false",
    "download.default_directory": os.getcwd(),
}
chrome_options = Options()
chrome_options.add_argument("--disable-extensions")
chrome_options.add_experimental_option("prefs", prefs)

webdriver.DesiredCapabilities.CHROME['proxy'] = {
  "httpProxy":PROXY,
  "ftpProxy":PROXY,
  "sslProxy":PROXY,
  "noProxy":None,
  "proxyType":"MANUAL",
  "class":"org.openqa.selenium.Proxy",
  "autodetect":False
}
driver = webdriver.Remote(
        command_executor='http://aaa.bbb.ccc:4444/wd/hub',
        desired_capabilities=DesiredCapabilities.CHROME)

With a 'normal' webdriver I am able to download the file without issues on the local computer. Then I can use the testing code to e.g. verify the content of the downloaded file (which can change depending on test parameters). It is not a test of the download itself, but I need a way to verify the contents of the generated file ...

But how to do that using a remote webdriver? I have not found anything helpful anywhere...

917

asked Nov 02 '17 06:11

Alex

2 Answers

The Selenium API doesn't provide a way to get a file downloaded on a remote machine.

But it's still possible with Selenium alone depending on the browser.

With Chrome the downloaded files can be listed by navigating chrome://downloads/ and retrieved with an injected <input type="file"> in the page :

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import os, time, base64


def get_downloaded_files(driver):

  if not driver.current_url.startswith("chrome://downloads"):
    driver.get("chrome://downloads/")

  return driver.execute_script( \
    "return downloads.Manager.get().items_   "
    "  .filter(e => e.state === 'COMPLETE')  "
    "  .map(e => e.filePath || e.file_path); " )


def get_file_content(driver, path):

  elem = driver.execute_script( \
    "var input = window.document.createElement('INPUT'); "
    "input.setAttribute('type', 'file'); "
    "input.hidden = true; "
    "input.onchange = function (e) { e.stopPropagation() }; "
    "return window.document.documentElement.appendChild(input); " )

  elem._execute('sendKeysToElement', {'value': [ path ], 'text': path})

  result = driver.execute_async_script( \
    "var input = arguments[0], callback = arguments[1]; "
    "var reader = new FileReader(); "
    "reader.onload = function (ev) { callback(reader.result) }; "
    "reader.onerror = function (ex) { callback(ex.message) }; "
    "reader.readAsDataURL(input.files[0]); "
    "input.remove(); "
    , elem)

  if not result.startswith('data:') :
    raise Exception("Failed to get file content: %s" % result)

  return base64.b64decode(result[result.find('base64,') + 7:])



capabilities_chrome = { \
    'browserName': 'chrome',
    # 'proxy': { \
     # 'proxyType': 'manual',
     # 'sslProxy': '50.59.162.78:8088',
     # 'httpProxy': '50.59.162.78:8088'
    # },
    'goog:chromeOptions': { \
      'args': [
      ],
      'prefs': { \
        # 'download.default_directory': "",
        # 'download.directory_upgrade': True,
        'download.prompt_for_download': False,
        'plugins.always_open_pdf_externally': True,
        'safebrowsing_for_trusted_sources_enabled': False
      }
    }
  }

driver = webdriver.Chrome(desired_capabilities=capabilities_chrome)
#driver = webdriver.Remote('http://127.0.0.1:5555/wd/hub', capabilities_chrome)

# download a pdf file
driver.get("https://www.mozilla.org/en-US/foundation/documents")
driver.find_element_by_css_selector("[href$='.pdf']").click()

# list all the completed remote files (waits for at least one)
files = WebDriverWait(driver, 20, 1).until(get_downloaded_files)

# get the content of the first file remotely
content = get_file_content(driver, files[0])

# save the content in a local file in the working directory
with open(os.path.basename(files[0]), 'wb') as f:
  f.write(content)

With Firefox, the files can be directly listed and retrieved by calling the browser API with a script by switching the context :

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import os, time, base64

def get_file_names_moz(driver):
  driver.command_executor._commands["SET_CONTEXT"] = ("POST", "/session/$sessionId/moz/context")
  driver.execute("SET_CONTEXT", {"context": "chrome"})
  return driver.execute_async_script("""
    var { Downloads } = Components.utils.import('resource://gre/modules/Downloads.jsm', {});
    Downloads.getList(Downloads.ALL)
      .then(list => list.getAll())
      .then(entries => entries.filter(e => e.succeeded).map(e => e.target.path))
      .then(arguments[0]);
    """)
  driver.execute("SET_CONTEXT", {"context": "content"})

def get_file_content_moz(driver, path):
  driver.execute("SET_CONTEXT", {"context": "chrome"})
  result = driver.execute_async_script("""
    var { OS } = Cu.import("resource://gre/modules/osfile.jsm", {});
    OS.File.read(arguments[0]).then(function(data) {
      var base64 = Cc["@mozilla.org/scriptablebase64encoder;1"].getService(Ci.nsIScriptableBase64Encoder);
      var stream = Cc['@mozilla.org/io/arraybuffer-input-stream;1'].createInstance(Ci.nsIArrayBufferInputStream);
      stream.setData(data.buffer, 0, data.length);
      return base64.encodeToString(stream, data.length);
    }).then(arguments[1]);
    """, path)
  driver.execute("SET_CONTEXT", {"context": "content"})
  return base64.b64decode(result)

capabilities_moz = { \
    'browserName': 'firefox',
    'marionette': True,
    'acceptInsecureCerts': True,
    'moz:firefoxOptions': { \
      'args': [],
      'prefs': {
        # 'network.proxy.type': 1,
        # 'network.proxy.http': '12.157.129.35', 'network.proxy.http_port': 8080,
        # 'network.proxy.ssl':  '12.157.129.35', 'network.proxy.ssl_port':  8080,      
        'browser.download.dir': '',
        'browser.helperApps.neverAsk.saveToDisk': 'application/octet-stream,application/pdf', 
        'browser.download.useDownloadDir': True, 
        'browser.download.manager.showWhenStarting': False, 
        'browser.download.animateNotifications': False, 
        'browser.safebrowsing.downloads.enabled': False, 
        'browser.download.folderList': 2,
        'pdfjs.disabled': True
      }
    }
  }

# launch Firefox
# driver = webdriver.Firefox(capabilities=capabilities_moz)
driver = webdriver.Remote('http://127.0.0.1:5555/wd/hub', capabilities_moz)

# download a pdf file
driver.get("https://www.mozilla.org/en-US/foundation/documents")
driver.find_element_by_css_selector("[href$='.pdf']").click()

# list all the downloaded files (waits for at least one)
files = WebDriverWait(driver, 20, 1).until(get_file_names_moz)

# get the content of the last downloaded file
content = get_file_content_moz(driver, files[0])

# save the content in a local file in the working directory
with open(os.path.basename(files[0]), 'wb') as f:
  f.write(content)

133

answered Oct 08 '22 15:10

Florent B.

Webdriver:

If you are using webdriver, means your code uses the internal Selenium client and server code to communicate with browser instance. And the downloaded files are stored in the local machine which can be directly accessed by using languages like java, python, .Net, node.js, ...

Remote WebDriver [Selenium-Grid]:

If you are using Remote webdriver means you are using GRID concept, The main purpose of the Gird is To distribute your tests over multiple machines or virtual machines (VMs). Form this your code uses Selenium client to communicate with Selenium Grid Server, which passes instruction to the Registered node with the specified browser. Form their Grid Node will pass the instructions form browser-specific driver to browser instance. Here the downloads takes place to the file-system | hard-disk of that system, but users don't have access to the file-system on the virtual machines where the browser is running.

By using javascript if we can access the file, then we can convert the file to base64-String and return to the client code. But for security reasons Javascript will not allow to read files form Disk.

If Selenium Grid hub and node's are in same system, and they are in public Network then you may change the path of the downloaded file to Some of the public downloaded paths like ../Tomcat/webapps/Root/CutrentTimeFolder/file.pdf. By using the public URL you can access the file directly.

For example downloading the file[] from Root folder of tomcat.

System.out.println("FireFox Driver Path « "+ geckodriverCloudRootPath);
File temp = File.createTempFile("geckodriver",  null);
temp.setExecutable(true);
FileUtils.copyURLToFile(new URL( geckodriverCloudRootPath ), temp);

System.setProperty("webdriver.gecko.driver", temp.getAbsolutePath() );
capabilities.setCapability("marionette", true);

If Selenium Grid hub and node are not in same system, the you may not get the downloaded file, because Grid Hub will be in public network[WAN] and Node will in private network[LAN] of the organisation.

You can change browser's downloading files path to a specified folder on hard disk. By using the below code.

String downloadFilepath = "E:\\download";
    
HashMap<String, Object> chromePrefs = new HashMap<String, Object>();
chromePrefs.put("profile.default_content_settings.popups", 0);
chromePrefs.put("download.default_directory", downloadFilepath);
ChromeOptions options = new ChromeOptions();
HashMap<String, Object> chromeOptionsMap = new HashMap<String, Object>();
options.setExperimentalOption("prefs", chromePrefs);
options.addArguments("--test-type");
options.addArguments("--disable-extensions"); //to disable browser extension popup

DesiredCapabilities cap = DesiredCapabilities.chrome();
cap.setCapability(ChromeOptions.CAPABILITY, chromeOptionsMap);
cap.setCapability(CapabilityType.ACCEPT_SSL_CERTS, true);
cap.setCapability(ChromeOptions.CAPABILITY, options);
RemoteWebDriver driver = new ChromeDriver(cap);

@ See

Downloading Files to a Sauce Labs Virtual Machine Prior to Testing
Testing PDF Downloads
Grid HUB Running as a standalone server

answered Oct 08 '22 15:10

Yash

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to download a file using the remote selenium webdriver?

Tags:

selenium

selenium-webdriver

Alex

People also ask

2 Answers

Florent B.

Webdriver:

Remote WebDriver [Selenium-Grid]:

Yash

Recent Activity

Donate For Us

How to download a file using the remote selenium webdriver?

Tags:

selenium

selenium-webdriver

Alex

People also ask

2 Answers

Florent B.

Webdriver:

Remote WebDriver [Selenium-Grid]:

Yash

Related questions

Recent Activity

Donate For Us