Download file through Google Chrome in headless mode

My code works fine with ChromeDriver in 'normal' mode. When I switch to headless mode, it doesn't download the file. I have already tried the code I found around the internet, but it didn't work.

chrome_options = Options()
chrome_options.add_argument("--headless")
self.driver = webdriver.Chrome(chrome_options=chrome_options, executable_path=r'{}/chromedriver'.format(os.getcwd()))
self.driver.set_window_size(1024, 768)
self.driver.command_executor._commands["send_command"] = ("POST", '/session/$sessionId/chromium/send_command')

params = {'cmd': 'Page.setDownloadBehavior', 'params': {'behavior': 'allow', 'downloadPath': os.getcwd()}}
self.driver.execute("send_command", params)

Does anyone have any idea how to solve this problem?

PS: I don't necessarily need to use ChromeDriver. If it works with another driver, that's fine for me.

CBury asked Aug 21 '19 22:08




3 Answers

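For JavaScript, use the code below:

    const webdriver = require('selenium-webdriver');
    const chrome = require('selenium-webdriver/chrome');

    let options = new chrome.Options();
    // Pass each switch as its own argument; one space-separated string is treated as a single flag.
    options.addArguments('--headless', '--window-size=1500,1200');
    options.setUserPreferences({
        'plugins.always_open_pdf_externally': true,
        'profile.default_content_settings.popups': 0,
        // Download_File_Path must be defined elsewhere as the target folder.
        'download.default_directory': Download_File_Path
    });
    driver = await new webdriver.Builder().setChromeOptions(options).forBrowser('chrome').build();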

Then switch tabs as soon as you click the download button:

    await driver.sleep(1000); 
    var Handle = await driver.getAllWindowHandles();
    await driver.switchTo().window(Handle[1]);
Justin Chetty answered Oct 17 '22 08:10


First the solution

Minimum Prerequisites:

  • Selenium client version: Selenium v3.141.59
  • Chrome version: Chrome v77.0
  • ChromeDriver version: ChromeDriver v77.0

To download the file by clicking the element with the text Download Data on this website, you can use the following solution:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.chrome.options import Options
    
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--window-size=1920,1080")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    # chrome_options= is deprecated in favor of options=.
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe', service_args=["--log-path=./Logs/DubiousDan.log"])
    print("Headless Chrome Initialized")
    # Allow downloads in headless mode and choose the target folder via DevTools.
    params = {'behavior': 'allow', 'downloadPath': r'C:\Users\Debanjan.B\Downloads'}
    driver.execute_cdp_cmd('Page.setDownloadBehavior', params)
    driver.get("https://www.mockaroo.com/")
    driver.execute_script("scroll(0, 250)")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#download"))).click()
    print("Download button clicked")
    #driver.quit()
    
  • Console Output:

    Headless Chrome Initialized
    Download button clicked
    
  • File Downloading snapshot:

    [Image: ChromeHeadlessDownload]
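A click on the download button only starts the transfer, so a script that quits the driver right away may leave a partial file behind. The following is a minimal sketch of a polling helper, not part of the original answer. The name `wait_for_download` and its parameters are hypothetical. It assumes the download folder is dedicated to this run and starts out empty, and it relies on Chrome's convention of giving in-progress downloads a `.crdownload` suffix.

```python
import os
import time


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Return the names of finished files in `directory`, or raise on timeout.

    Assumption: `directory` was empty before the download started.
    Chrome keeps a ".crdownload" suffix on files that are still being
    written, so the download counts as finished once at least one file
    exists and no ".crdownload" file remains.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(directory)
        partial = [n for n in names if n.endswith(".crdownload")]
        finished = [n for n in names if not n.endswith(".crdownload")]
        if finished and not partial:
            return sorted(finished)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"No completed download in {directory!r} after {timeout}s"
            )
        time.sleep(poll_interval)
```

In the code block above, this could be called with the same folder that was passed as `downloadPath`, after the click and before `driver.quit()`.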


Details

Downloading files through headless Chromium has been one of the most sought-after features since Headless Chrome was introduced.

Since then, contributors have published various workarounds, including:

  • Downloading with chrome headless and selenium
  • Python equivalent of a given wget command

Now the good news: the Chromium team has officially announced support for downloading files through headless Chromium.


In the discussion Headless mode doesn't save file downloads @eseckler mentioned:

Downloads in headless work a little differently. There's the Page.setDownloadBehavior devtools command to set a download folder. We're working on a way to use DevTools network interception to stream the downloaded file via DevTools as well.

A detailed discussion can be found at Issue 696481: Headless mode doesn't save file downloads

Finally, @bugdroid's revision seems to have nailed the issue for us.


[ChromeDriver] Added support for headless mode to download files

Previously, Chromedriver running in headless mode would not properly download files due to the fact it sparsely parses the preference file given to it. Engineers from the headless chrome team recommended using DevTools's "Page.setDownloadBehavior" to fix this. This changelist implements this fix. Downloaded files default to the current directory and can be set using download_dir when instantiating a chromedriver instance. Also added tests to ensure proper download functionality.

Here is the revision and commit

From ChromeDriver v77.0.3865.40 (2019-08-20) release notes:

Resolved issue 2454: Headless mode doesn't save file downloads [Pri-2]
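The `Page.setDownloadBehavior` call described in the quotes above can be wrapped in a small helper. This is only a sketch: the function names `build_download_params` and `enable_headless_downloads` are hypothetical, and `driver` is assumed to be any object exposing `execute_cdp_cmd(cmd, params)`, as the Selenium Chrome driver does in the solution code. Converting the folder to an absolute path is a precaution chosen here, not something the release notes require.

```python
import os


def build_download_params(download_dir):
    """Build the payload for the DevTools command Page.setDownloadBehavior."""
    # Assumption: an absolute path is the safest form to hand to the browser.
    return {"behavior": "allow", "downloadPath": os.path.abspath(download_dir)}


def enable_headless_downloads(driver, download_dir):
    """Send Page.setDownloadBehavior through the driver's CDP hook.

    `driver` is assumed to expose execute_cdp_cmd(cmd, params).
    Returns the payload that was sent, which is handy for logging.
    """
    params = build_download_params(download_dir)
    driver.execute_cdp_cmd("Page.setDownloadBehavior", params)
    return params
```

With a real driver, the call would be something like `enable_headless_downloads(driver, os.getcwd())`, made before navigating to the page that triggers the download.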

Solution

  • Update ChromeDriver to the latest ChromeDriver v77.0 level.
  • Update Chrome to the Chrome v77.0 level (as per the ChromeDriver v76.0 release notes).
  • Note: Chrome v77.0 has not yet reached general availability, so until then you can download and install a development build and test with either:

    • Chrome Canary
    • Latest build from the Dev Channel

Outro

However, macOS users will have to wait for their piece of the pie, as reported in the issue "On Chromedriver, headless chrome crashes after sending Page.setDownloadBehavior on MacOSX".

undetected Selenium answered Oct 17 '22 08:10


ChromeDriver version: 95.0.4638.54
Chrome version: 95.0.4638.69

    from selenium.webdriver.chrome.options import Options    
 
    options = Options()
    options.add_argument("--headless")
    options.add_argument("--start-maximized")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-extensions")
    options.add_argument('--disable-dev-shm-usage')    
    options.add_argument("--disable-gpu")
    options.add_argument('--disable-software-rasterizer')
    options.add_argument("user-agent=Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Microsoft; Lumia 640 XL LTE) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Mobile Safari/537.36 Edge/12.10166")
    options.add_argument("--disable-notifications")

    options.add_experimental_option("prefs", {
        "download.default_directory": "C:\\link\\to\\folder",
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing_for_trusted_sources_enabled": False,
        "safebrowsing.enabled": False
        }
    )

What seemed to work was using "\\" instead of "/" in the download path. With forward slashes, no error was thrown, but no documents were downloaded either. Using double backslashes did the job.
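One way to avoid hand-writing the separators is to let Python produce the platform's native form of the path. The sketch below is an assumption-laden illustration rather than part of the answer: `build_download_prefs` is a hypothetical helper, and it relies on `os.path.normpath`, which on Windows converts forward slashes to backslashes.

```python
import os


def build_download_prefs(download_dir):
    """Return Chrome "prefs" with the download folder as a native absolute path."""
    # On Windows, normpath rewrites "/" as "\\"; on other platforms the
    # path is only made absolute and tidied up.
    native_dir = os.path.normpath(os.path.abspath(download_dir))
    return {
        "download.default_directory": native_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
    }
```

The result could then be passed as `options.add_experimental_option("prefs", build_download_prefs("downloads"))`, where `"downloads"` is a placeholder folder name.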

Rahil Kadakia answered Oct 17 '22 08:10