
Selenium (Python) - waiting for a download process to complete using Chrome web driver

I'm using Selenium and Python with ChromeDriver (on Windows) to automate downloading a large number of files from different pages. My code works, but the solution is far from ideal: the function below clicks a button on the website that triggers a JavaScript function, which generates a PDF file and then downloads it.

I had to use a static wait for the download to complete (ugly). I cannot check the file system to verify when the download has completed, since I'm using multithreading (downloading lots of files from different pages at once), and the file names are generated dynamically by the website itself.

My code:

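def file_download(num, drivervar):
    global Counter  # module-level counter shared by the threads
    Counter += 1
    try:
        download_button = WebDriverWait(drivervar, 20).until(EC.element_to_be_clickable((By.ID, 'download button ID')))
        download_button.click()
        time.sleep(10)  # static wait for the download; the duration is a placeholder
    except TimeoutException:  # Retry once
        print('Timeout in thread number: ' + str(num) + ', retrying...')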

Is it possible to determine download completion in webdriver? I want to avoid using time.sleep(x).

Thanks a lot.

BlackMamba asked Jan 15 '18 12:01


2 Answers

You can get the status of each download by visiting chrome://downloads/ with the driver.

To wait for all the downloads to finish and to list all the paths:

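def every_downloads_chrome(driver):
    # Open Chrome's downloads page if the driver is not already on it.
    if not driver.current_url.startswith("chrome://downloads"):
        driver.get("chrome://downloads/")
    # Returns the file URLs once every download is complete; otherwise
    # returns None, so WebDriverWait keeps polling.
    return driver.execute_script("""
        var items = document.querySelector('downloads-manager')
            .shadowRoot.getElementById('downloadsList').items;
        if (items.every(e => e.state === "COMPLETE"))
            return items.map(e => e.fileUrl || e.file_url);
        """)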

# waits for all the files to be completed and returns the paths
paths = WebDriverWait(driver, 120, 1).until(every_downloads_chrome)

Updated to support changes in Chrome up to version 81.

Florent B. answered Sep 19 '22 11:09

I had the same problem and found a solution. You can check whether a .crdownload file is in your download folder. If the download folder contains no files with the .crdownload extension, all your downloads have completed. I think this only works for Chrome and Chromium.

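import os
import time

def downloads_done():
    for filename in os.listdir("/downloads"):
        if ".crdownload" in filename:
            time.sleep(0.5)
            return downloads_done()  # partial file found: check the folder again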

Calling downloads_done() keeps re-checking the folder until all downloads have completed. If you are downloading massive files, say 80 gigabytes, I don't recommend this, because the function can reach the maximum recursion depth.

2020 edit:

import os
import time

def wait_for_downloads():
    print("Waiting for downloads", end="")
    while any([filename.endswith(".crdownload") for filename in os.listdir("/downloads")]):
        time.sleep(2)
        print(".", end="")

The "end" keyword argument of print() defaults to a newline, but here we replace it with an empty string. While any file name in the /downloads folder ends with .crdownload, the loop sleeps for 2 seconds and prints one dot to the console without a newline.
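The polling idea can also be given a timeout and made to report which files arrived, which helps when the site generates the file names. The following is only a minimal sketch, not a definitive implementation: the function name wait_for_new_downloads and its parameters (directory, before, timeout, poll) are hypothetical, and it assumes Chrome keeps the .crdownload suffix on a file for as long as it is being written.

```python
import os
import time


def wait_for_new_downloads(directory, before, timeout=120.0, poll=0.5):
    """Block until at least one new, fully written file is in `directory`.

    `before` is the set of file names that were present before the
    download was triggered (hypothetical parameter for this sketch).
    Returns the sorted list of new file names, or raises TimeoutError
    once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = set(os.listdir(directory))
        # Files Chrome is still writing (assumed to end in .crdownload).
        in_progress = [n for n in names if n.endswith(".crdownload")]
        new_files = sorted(names - set(before))
        # Done when nothing is partial and something new has appeared.
        if not in_progress and new_files:
            return new_files
        if time.monotonic() >= deadline:
            raise TimeoutError(f"downloads in {directory!r} did not finish")
        time.sleep(poll)
```

To use it, take before = set(os.listdir(directory)) just before clicking the download button and call the function right after the click. Because it loops instead of recursing, a long download cannot hit the recursion limit. With several threads this presumably only works if each driver saves into its own folder, for example by setting Chrome's download.default_directory preference per driver.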

I don't really recommend using Selenium anymore after finding out about requests, but if it's a heavily guarded site with Cloudflare, CAPTCHAs, etc., you might have to resort to Selenium.

Walter Randomness answered Sep 23 '22 11:09