I've used selenium to initiate a download. After the download is complete, certain actions need to be taken, is there any simple method to find out when a download has complete? (I am using the FireFox driver)
To handle a PDF document in Selenium test automation, we can use a java library called PDFBox. Apache PDFBox is an open-source library that exclusively helps in handling the PDF documents. We can use it to verify the text present in the document, extract a specific section of text or image in the documents, and so on.
You should create a checksum (an MD5 Sum, or SHA1 Sum) for the file on the server. Then after the download, run a the same checksum and the two values need to match. If you are downloading via Java, you can use the MessageDigest class to help you generate the digest.
I came across this problem recently. I was downloading multiple files at once and had to build in a way to timeout if the downloads failed.
The code checks the filenames in some download directory every second and exits once they are complete or if it takes longer than 20 seconds to finish. The returned download time was used to check if the downloads were successful or if it timed out.
import time import os def download_wait(path_to_downloads): seconds = 0 dl_wait = True while dl_wait and seconds < 20: time.sleep(1) dl_wait = False for fname in os.listdir(path_to_downloads): if fname.endswith('.crdownload'): dl_wait = True seconds += 1 return seconds
I believe that this only works with chrome files as they end with the .crdownload extension. There may be a similar way to check in other browsers.
Edit: I recently changed the way that I use this function for times that .crdownload
does not appear as the extension. Essentially this just waits for the correct number of files as well.
def download_wait(directory, timeout, nfiles=None): """ Wait for downloads to finish with a specified timeout. Args ---- directory : str The path to the folder where the files will be downloaded. timeout : int How many seconds to wait until timing out. nfiles : int, defaults to None If provided, also wait for the expected number of files. """ seconds = 0 dl_wait = True while dl_wait and seconds < timeout: time.sleep(1) dl_wait = False files = os.listdir(directory) if nfiles and len(files) != nfiles: dl_wait = True for fname in files: if fname.endswith('.crdownload'): dl_wait = True seconds += 1 return seconds
There is no built-in to selenium way to wait for the download to be completed.
The general idea here would be to wait until a file would appear in your "Downloads" directory.
This might either be achieved by looping over and over again checking for file existence:
Or, by using things like watchdog
to monitor a directory:
import os
import time
def latest_download_file():
path = r'Downloads folder file path'
os.chdir(path)
files = sorted(os.listdir(os.getcwd()), key=os.path.getmtime)
newest = files[-1]
return newest
fileends = "crdownload"
while "crdownload" == fileends:
time.sleep(1)
newest_file = latest_download_file()
if "crdownload" in newest_file:
fileends = "crdownload"
else:
fileends = "none"
This is a combination of a few solutions. I didn't like that I had to scan the entire downloads folder for a file ending in "crdownload". This code implements a function that pulls the newest file in downloads folder. Then it simply checks if that file is still being downloaded. Used it for a Selenium tool I am building worked very well.
I know its too late for the answer, though would like to share a hack for future readers.
You can create a thread say thread1 from main thread and initiate your download here. Now, create some another thread, say thread2 and in there ,let it wait till thread1 completes using join() method.Now here,you can continue your flow of execution after download completes.
Still make sure you dont initiate your download using selenium, instead extract the link using selenium and use requests module to download.
Download using requests module
For eg:
def downloadit():
#download code here
def after_dwn():
dwn_thread.join() #waits till thread1 has completed executing
#next chunk of code after download, goes here
dwn_thread = threading.Thread(target=downloadit)
dwn_thread.start()
metadata_thread = threading.Thread(target=after_dwn)
metadata_thread.start()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With