Selenium give file name when downloading

Here is another simple solution: wait until the download has completed, then read the downloaded file's name from the browser's downloads page (chrome://downloads in Chrome, about:downloads in Firefox).

Chrome:

import time

# method to get the downloaded file name
# (uses a module-level `driver`; returns None if the download does not finish within waitTime seconds)
# note: the selectors below depend on the chrome://downloads page markup, which can change between Chrome versions
def getDownLoadedFileName(waitTime):
    driver.execute_script("window.open()")
    # switch to new tab
    driver.switch_to.window(driver.window_handles[-1])
    # navigate to chrome downloads
    driver.get('chrome://downloads')
    # define the endTime
    endTime = time.time() + waitTime
    while True:
        try:
            # get downloaded percentage
            downloadPercentage = driver.execute_script(
                "return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value")
            # check if downloadPercentage is 100 (otherwise the script will keep waiting)
            if downloadPercentage == 100:
                # return the file name once the download is completed
                return driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content  #file-link').text")
        except Exception:
            # the download item may not be rendered yet; try again on the next pass
            pass
        time.sleep(1)
        if time.time() > endTime:
            break

Firefox:

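import time

from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# uses a module-level `driver`; returns None if no file name appears within waitTime seconds
def getDownLoadedFileName(waitTime):
    # remember the windows that are already open so the new one can be detected
    handles = driver.window_handles
    driver.execute_script("window.open()")
    WebDriverWait(driver, 10).until(EC.new_window_is_opened(handles))
    driver.switch_to.window(driver.window_handles[-1])
    driver.get("about:downloads")

    endTime = time.time() + waitTime
    while True:
        try:
            fileName = driver.execute_script("return document.querySelector('#contentAreaDownloadsView .downloadMainArea .downloadContainer description:nth-of-type(1)').value")
            if fileName:
                return fileName
        except Exception:
            # the downloads list may not be rendered yet; try again on the next pass
            pass
        time.sleep(1)
        if time.time() > endTime:
            break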

Once you click on the download link/button, just call the above method.

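from selenium.webdriver.common.by import By

# click on download link (Selenium 4 locator syntax; the find_element_by_* helpers were removed)
driver.find_element(By.PARTIAL_LINK_TEXT, "Excel").click()
# get the downloaded file name
latestDownloadedFileName = getDownLoadedFileName(180)  # wait up to 3 minutes for the download to complete
print(latestDownloadedFileName)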

Java + Chrome:

Here is the method in Java.

public String waitUntilDownloadCompleted(WebDriver driver) throws InterruptedException {
    // store the current window handle
    String mainWindow = driver.getWindowHandle();

    // open a new tab
    JavascriptExecutor js = (JavascriptExecutor) driver;
    js.executeScript("window.open()");
    // switch to the newly opened tab (the last handle visited by the loop)
    for (String winHandle : driver.getWindowHandles()) {
        driver.switchTo().window(winHandle);
    }
    // navigate to chrome downloads
    driver.get("chrome://downloads");

    // wait until the file is downloaded
    // (note: there is no timeout, so this loops until the progress value reaches 100)
    long percentage = 0;
    while (percentage != 100) {
        try {
            percentage = (Long) js.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value");
        } catch (Exception e) {
            // nothing to do, just wait
        }
        Thread.sleep(1000);
    }
    // get the latest downloaded file name
    String fileName = (String) js.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').text");
    // get the latest downloaded file url
    String sourceURL = (String) js.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').href");
    // file downloaded location (read from the file icon's src attribute)
    String downloadedAt = (String) js.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div.is-active.focus-row-active #file-icon-wrapper img').src");
    // print the details
    System.out.println("Download details");
    System.out.println("File Name :-" + fileName);
    System.out.println("Downloaded path :- " + downloadedAt);
    System.out.println("Downloaded from url :- " + sourceURL);
    // close the downloads tab
    driver.close();
    // switch back to main window
    driver.switchTo().window(mainWindow);
    return fileName;
}

This is how to call it from your Java code.

// download triggering step
downloadExe.click();
// now wait until the download finishes and then get the file name
System.out.println(waitUntilDownloadCompleted(driver));

Output:

Download details
File Name :-RubyMine-2019.1.2 (7).exe
Downloaded path :- chrome://fileicon/C%3A%5CUsers%5Csupputuri%5CDownloads%5CRubyMine-2019.1.2%20(7).exe?scale=1.25x
Downloaded from url :- https://download-cf.jetbrains.com/ruby/RubyMine-2019.1.2.exe
RubyMine-2019.1.2 (7).exe

You cannot specify the name of the downloaded file through Selenium. However, you can download the file, find the latest file in the download folder, and rename it as you want.

Note: the method below was adapted from search results and may contain errors, but it shows the idea.

import os
import shutil

# Initial_path is the folder the browser downloads into
filename = max([os.path.join(Initial_path, f) for f in os.listdir(Initial_path)], key=os.path.getctime)
shutil.move(filename, os.path.join(Initial_path, "newfilename.ext"))

I hope this snippet is not too confusing. It took me a while to write and has been really useful, because I could not find a clear answer to this problem using just this library.

import os
import time

def tiny_file_rename(newname, folder_of_download):
    # newest entry in the download folder, by creation time
    filename = max(os.listdir(folder_of_download), key=lambda xa: os.path.getctime(os.path.join(folder_of_download, xa)))
    if '.part' in filename:
        # Firefox is still writing the file: wait a moment, then look up the newest file again
        time.sleep(1)
        filename = max(os.listdir(folder_of_download), key=lambda xa: os.path.getctime(os.path.join(folder_of_download, xa)))
    os.rename(os.path.join(folder_of_download, filename), os.path.join(folder_of_download, newname))

Hope this saves someone's day, cheers.

EDIT: Thanks to @Om Prakash for editing my code; it reminded me that I didn't explain the code thoroughly.

Using max() on the directory listing can hit a race condition and leave you with an empty or corrupted file (I know it from experience). Selenium does not wait for a download to complete, so when you look up the most recently created file, an incomplete file can show up in the listing and the code will try to move it. Check that the file is completely downloaded first. Even then, you are better off waiting a little for Firefox to release the file.
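The check described above could be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation: the function name wait_for_finished_download, the list of temporary suffixes, and the "same size on two consecutive polls" rule are all assumptions of this sketch, not something Selenium provides.

```python
import os
import time

# Suffixes browsers commonly give to in-progress downloads (assumed list; extend as needed).
TEMP_SUFFIXES = ('.part', '.crdownload', '.tmp')

def wait_for_finished_download(folder, timeout=60, poll=0.5):
    """Return the path of the newest file in `folder` once it looks complete.

    "Complete" here means: no temporary download files remain in the folder
    and the newest file has the same size on two consecutive polls.
    """
    deadline = time.time() + timeout
    last_seen = None  # (path, size) observed on the previous poll
    while time.time() < deadline:
        names = os.listdir(folder)
        in_progress = any(name.endswith(TEMP_SUFFIXES) for name in names)
        files = [os.path.join(folder, name) for name in names]
        files = [path for path in files if os.path.isfile(path)]
        if files and not in_progress:
            newest = max(files, key=os.path.getctime)
            current = (newest, os.path.getsize(newest))
            if current == last_seen:
                return newest
            last_seen = current
        else:
            last_seen = None
        time.sleep(poll)
    raise TimeoutError('Download in %s did not finish within %s seconds' % (folder, timeout))
```

Calling it right after the click that triggers the download, and renaming only the path it returns, avoids moving a half-written file.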

EDIT 2: More Code

I was asked whether 1 second is enough time. Mostly it is, but if you need to wait longer you could change the code above to this:

import os
import time

def tiny_file_rename(newname, folder_of_download, time_to_wait=60):
    time_counter = 0
    filename = max(os.listdir(folder_of_download), key=lambda xa: os.path.getctime(os.path.join(folder_of_download, xa)))
    while '.part' in filename:
        time.sleep(1)
        time_counter += 1
        if time_counter > time_to_wait:
            raise Exception('Waited too long for file to download')
        # look up the newest file again, otherwise the loop can never see the finished download
        filename = max(os.listdir(folder_of_download), key=lambda xa: os.path.getctime(os.path.join(folder_of_download, xa)))
    os.rename(os.path.join(folder_of_download, filename), os.path.join(folder_of_download, newname))

There is something I would correct in @parishodak's answer:

os.listdir returns only the relative path (here, just the name of the file), not the absolute path, so os.path.getctime looks for the file in the current working directory.

That is why @FreshRamen got the following error:

File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/genericpath.py", line 72, in getctime
    return os.stat(filename).st_ctime
OSError: [Errno 2] No such file or directory: '.localized'

Here is the corrected code:

import os
import shutil

filepath = r'c:\downloads'
newfilename = 'newfilename.ext'  # placeholder: the name you want the file to have
# join each name with the folder so getctime receives a full path
filename = max([os.path.join(filepath, f) for f in os.listdir(filepath)], key=os.path.getctime)
shutil.move(filename, os.path.join(filepath, newfilename))
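To make the relative-versus-absolute point concrete, here is a small self-contained sketch. The helper name newest_file and the file name report.csv are hypothetical and exist only for this illustration; it uses a temporary directory so it can be run anywhere.

```python
import os
import tempfile

def newest_file(folder):
    """Return the full path of the most recently created entry in `folder`."""
    # Join every bare name with the folder before asking for its creation time.
    candidates = [os.path.join(folder, name) for name in os.listdir(folder)]
    return max(candidates, key=os.path.getctime)

with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, 'report.csv'), 'w') as handle:
        handle.write('a,b\n')
    # os.listdir yields bare names, which only resolve relative to the current directory.
    print(os.listdir(folder))
    # The joined path can be passed to os.path.getctime or shutil.move from any directory.
    print(newest_file(folder))
```

Passing a bare name to os.path.getctime only works when the script happens to run inside the download folder, which is why the original code failed.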

I've come up with a different solution. Since you only care about the last downloaded file, why not download it into a dummy_dir, so that it is the only file in that directory? Once it's downloaded, you can move it to your destination_dir and change its name at the same time.

Here is an example that works with Firefox:

import os
import shutil
import time

def rename_last_downloaded_file(dummy_dir, destination_dir, new_file_name):
    def get_last_downloaded_file_path(dummy_dir):
        """ Return the last modified -in this case last downloaded- file path.

            This function is going to loop as long as the directory is empty.
        """
        while not os.listdir(dummy_dir):
            time.sleep(1)
        return max([os.path.join(dummy_dir, f) for f in os.listdir(dummy_dir)], key=os.path.getctime)

    while '.part' in get_last_downloaded_file_path(dummy_dir):
        time.sleep(1)
    shutil.move(get_last_downloaded_file_path(dummy_dir), os.path.join(destination_dir, new_file_name))

You can fiddle with the sleep time and add a TimeoutException as well, as you see fit.
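One way the timeout could be added is sketched below. It is only an illustration of the idea: the function name move_single_download and its timeout and poll parameters are assumptions of this sketch, and it raises Python's built-in TimeoutError rather than Selenium's TimeoutException so that it runs without Selenium installed.

```python
import os
import shutil
import time

def move_single_download(dummy_dir, destination_dir, new_file_name, timeout=60, poll=1.0):
    """Move the downloaded file out of dummy_dir under a new name, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = os.listdir(dummy_dir)
        # Firefox keeps a '.part' file next to the target while the download is running.
        still_downloading = any(name.endswith('.part') for name in names)
        if names and not still_downloading:
            paths = [os.path.join(dummy_dir, name) for name in names]
            source = max(paths, key=os.path.getctime)
            target = os.path.join(destination_dir, new_file_name)
            shutil.move(source, target)
            return target
        time.sleep(poll)
    raise TimeoutError('No finished download appeared in %s within %s seconds' % (dummy_dir, timeout))
```

Because the loop is bounded by a deadline, a download that never starts or never finishes surfaces as an error instead of hanging the test run.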

Here is the code sample I used to download a PDF with a specific file name. First, configure the Chrome WebDriver with the required options. Then, after clicking the button that opens the PDF popup window, call a function that waits for the download to finish and renames the downloaded file.

import os
import time
import shutil

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

# function to wait for download to finish and then rename the latest downloaded file
# (uses the module-level `driver` and `download_dir` defined below)
def wait_for_download_and_rename(newFilename):
    # return the paths of all completed chrome downloads (an empty list if there are none yet)
    def chrome_downloads(drv):
        if "chrome://downloads" not in drv.current_url: # if 'chrome downloads' is not current tab
            drv.execute_script("window.open('');") # open a new tab
            drv.switch_to.window(drv.window_handles[-1]) # switch to the new tab
            drv.get("chrome://downloads/") # navigate to chrome downloads
        return drv.execute_script("""
            return document.querySelector('downloads-manager')
            .shadowRoot.querySelector('#downloadsList')
            .items.filter(e => e.state === 'COMPLETE')
            .map(e => e.filePath || e.file_path || e.fileUrl || e.file_url);
            """)
    # poll every second, for up to 120 seconds, until at least one download is complete
    dld_file_paths = WebDriverWait(driver, 120, 1).until(chrome_downloads) # returns list of downloaded file paths
    # Close the current tab (chrome downloads)
    if "chrome://downloads" in driver.current_url:
        driver.close()
    # Switch back to original tab
    driver.switch_to.window(driver.window_handles[0])
    # get latest downloaded file name and path
    dlFilename = dld_file_paths[0] # latest downloaded file from the list
    # wait till downloaded file appears in download directory
    time_to_wait = 20 # adjust timeout as per your needs
    time_counter = 0
    while not os.path.isfile(dlFilename):
        time.sleep(1)
        time_counter += 1
        if time_counter > time_to_wait:
            break
    # rename the downloaded file
    shutil.move(dlFilename, os.path.join(download_dir, newFilename))
    return

# specify custom download directory
download_dir = r'c:\Downloads\pdf_reports'

# for configuring chrome pdf viewer for downloading pdf popup reports
chrome_options = webdriver.ChromeOptions()
chrome_options.add_experimental_option('prefs', {
    "download.default_directory": download_dir, # Set own Download path
    "download.prompt_for_download": False, # Do not ask for download at runtime
    "download.directory_upgrade": True, # Also needed to suppress download prompt
    "plugins.plugins_disabled": ["Chrome PDF Viewer"], # Disable this plugin
    "plugins.always_open_pdf_externally": True, # Download PDFs instead of opening them in the viewer
    })

# get webdriver with options for configuring chrome pdf viewer
driver = webdriver.Chrome(options=chrome_options)

# open desired webpage
driver.get('https://mywebsite.com/mywebpage')

# click the button to open pdf popup (Selenium 4 locator syntax)
driver.find_element(By.ID, 'someid').click()

# call the function to wait for download to finish and rename the downloaded file
wait_for_download_and_rename('My file.pdf')

# close the browser windows
driver.quit()

Adjust the WebDriverWait timeout (120 seconds) to suit your needs.
