python selenium, find out when a download has completed?

I've used Selenium to initiate a download. Once the download is complete, certain actions need to be taken. Is there a simple way to find out when a download has completed? (I am using the Firefox driver.)

797

asked Dec 17 '15 15:12

applecider

4 Answers

I came across this problem recently. I was downloading multiple files at once and had to build in a way to time out if the downloads failed.

The code below checks the filenames in a download directory once per second and returns once no download is still in progress or 20 seconds have passed. I used the returned elapsed time to tell whether the downloads succeeded or timed out.

import time
import os

def download_wait(path_to_downloads):
    seconds = 0
    dl_wait = True
    while dl_wait and seconds < 20:
        time.sleep(1)
        dl_wait = False
        for fname in os.listdir(path_to_downloads):
            if fname.endswith('.crdownload'):
                dl_wait = True
        seconds += 1
    return seconds

I believe this only works with Chrome, whose in-progress downloads end with the .crdownload extension. There may be a similar check for other browsers; Firefox, for instance, writes in-progress downloads to files ending in .part.

Edit: I recently changed how I use this function to handle cases where the .crdownload extension never appears. In essence, it now also waits until the directory holds the expected number of files.

import os
import time

def download_wait(directory, timeout, nfiles=None):
    """
    Wait for downloads to finish with a specified timeout.

    Args
    ----
    directory : str
        The path to the folder where the files will be downloaded.
    timeout : int
        How many seconds to wait until timing out.
    nfiles : int, defaults to None
        If provided, also wait for the expected number of files.

    """
    seconds = 0
    dl_wait = True
    while dl_wait and seconds < timeout:
        time.sleep(1)
        dl_wait = False
        files = os.listdir(directory)
        if nfiles and len(files) != nfiles:
            dl_wait = True

        for fname in files:
            if fname.endswith('.crdownload'):
                dl_wait = True

        seconds += 1
    return seconds
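
Since the question is about Firefox, here is a minimal sketch of the same polling idea extended to other browsers. It assumes that Firefox marks in-progress downloads with a .part suffix, just as Chrome uses .crdownload. The names wait_for_downloads, PARTIAL_SUFFIXES and poll_interval are made up for this illustration and are not part of Selenium.

```python
import os
import time

# Assumed suffixes of in-progress downloads: Chrome (.crdownload), Firefox (.part).
PARTIAL_SUFFIXES = (".crdownload", ".part")


def wait_for_downloads(directory, timeout=20, poll_interval=1.0, nfiles=None):
    """Return True once no partial files remain in `directory`, False on timeout.

    If `nfiles` is given, the directory must also hold exactly that many files.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        files = os.listdir(directory)
        # str.endswith accepts a tuple, so one call covers every suffix.
        in_progress = any(name.endswith(PARTIAL_SUFFIXES) for name in files)
        count_ok = nfiles is None or len(files) == nfiles
        if not in_progress and count_ok:
            return True
        time.sleep(poll_interval)
    return False
```

Unlike the function above, this sketch checks before it sleeps, so an empty directory counts as "done" unless nfiles is passed. Passing nfiles avoids mistaking a download that has not started yet for one that has finished.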

185

answered Sep 25 '22 01:09

Austin Mackillop

Selenium has no built-in way to wait for a download to complete.

The general idea is to wait until the file appears in your "Downloads" directory.

This can be achieved either by polling in a loop, checking whether the file exists:

Check and wait until a file exists to read it

Or by using a tool such as watchdog to monitor the directory:

How to watch a directory for changes?
Monitoring contents of files/directories?
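
The polling option can be sketched roughly as follows, using only the standard library. The function name wait_for_file and its parameters are hypothetical. The "size unchanged between two polls" check is an added heuristic, not something the linked answers prescribe. It assumes the finished file is non-empty and that the browser writes to it steadily while downloading.

```python
import os
import time


def wait_for_file(path, timeout=30, poll_interval=0.5):
    """Return True once `path` exists with a non-zero size that stayed
    the same across two consecutive polls; return False on timeout."""
    deadline = time.monotonic() + timeout
    last_size = -1
    while time.monotonic() < deadline:
        if os.path.isfile(path):
            size = os.path.getsize(path)
            # Same non-zero size twice in a row: assume writing has stopped.
            if size > 0 and size == last_size:
                return True
            last_size = size
        time.sleep(poll_interval)
    return False
```

A call such as wait_for_file("/path/to/Downloads/report.pdf") would then go right after the Selenium step that triggers the download; the path here is only a placeholder.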

answered Sep 23 '22 01:09

alecxe

import os
import time

def latest_download_file():
    # Replace with the path to your downloads folder.
    path = r'Downloads folder file path'
    # Sort entries by modification time; the newest one ends up last.
    files = sorted(os.listdir(path),
                   key=lambda name: os.path.getmtime(os.path.join(path, name)))
    return files[-1]

# Poll once per second until the newest file is no longer a partial Chrome download.
while True:
    time.sleep(1)
    newest_file = latest_download_file()
    if "crdownload" not in newest_file:
        break

This is a combination of a few solutions. I didn't like having to scan the entire downloads folder for a file ending in "crdownload". This code implements a function that returns the newest file in the downloads folder, then simply checks whether that file is still being downloaded. I used it in a Selenium tool I am building, and it worked very well.

answered Sep 21 '22 01:09

Red

I know it's late for an answer, but I'd like to share a hack for future readers.

You can create a thread, say thread1, from the main thread and start your download in it. Then create another thread, say thread2, and have it wait until thread1 completes by calling join(). From there you can continue your flow of execution once the download has finished.

Make sure you don't start the download with Selenium itself, though. Instead, extract the link with Selenium and use the requests module to download the file.

Download using requests module

For example:

import threading

def downloadit():
    # download code here
    pass

def after_dwn():
    dwn_thread.join()  # waits until the download thread (thread1) has finished
    # next chunk of code after the download goes here

dwn_thread = threading.Thread(target=downloadit)
dwn_thread.start()

metadata_thread = threading.Thread(target=after_dwn)
metadata_thread.start()
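
A slightly fuller sketch of the same two-thread pattern, with the download body filled in, might look like the following. To keep it free of extra dependencies it uses the standard library's urllib.request in place of the requests module mentioned above. The helper names download_file and run_after_download are invented for this example. It also assumes the link can be fetched without the browser's session cookies.

```python
import shutil
import threading
import urllib.request


def download_file(url, dest_path):
    """Copy the response body for `url` into `dest_path`."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


def run_after_download(url, dest_path, after):
    """Download in one thread; call `after(dest_path)` from a second thread
    once the download thread has finished. Returns the second thread."""
    dwn_thread = threading.Thread(target=download_file, args=(url, dest_path))
    dwn_thread.start()

    def after_dwn():
        dwn_thread.join()  # blocks until the download thread exits
        after(dest_path)

    follow_thread = threading.Thread(target=after_dwn)
    follow_thread.start()
    return follow_thread
```

Note that join() only reports that the download thread has ended, not that it succeeded, so the callback may still want to check that the file exists.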

answered Sep 21 '22 01:09

Dhyey Shah


Tags:

python

selenium