
Can't store downloaded files in their corresponding folders

I've written a Python script that uses Selenium to download a few document files (ending in .doc) from a webpage. I don't want to use the requests or urllib modules to download the files because the website I'm currently playing with does not have a true URL connected to each file; the links are JavaScript-encrypted. For this example, though, I've used a link in my script that mimics the same setup.

What my script does at this moment:

  1. Creates a master folder on the desktop
  2. Creates subfolders within the master folder, named after the files to be downloaded
  3. Downloads the files by clicking their links and puts them in the master folder (this is the part I need fixed)

How can I modify my script so that it downloads the files by clicking their links and puts each downloaded file in its corresponding folder?

This is my try so far:

import os
import time
from selenium import webdriver

link ='https://www.online-convert.com/file-format/doc' 

dirf = os.path.expanduser('~')
desk_location = dirf + r'\Desktop\file_folder'
if not os.path.exists(desk_location):os.mkdir(desk_location)

def download_files():
    driver.get(link)
    for item in driver.find_elements_by_css_selector("a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        #create a new folder, named after the file, to store the downloaded file in
        folder_name = item.get_attribute("href").split("/")[-1].split(".")[0]
        #set the new location of the folders to be created
        new_location = os.path.join(desk_location,folder_name)
        if not os.path.exists(new_location):os.mkdir(new_location)
        #set the location of the folders the downloaded files will be within
        file_location = os.path.join(new_location,filename)
        item.click()

        time_to_wait = 10
        time_counter = 0
        try:
            while not os.path.exists(file_location):
                time.sleep(1)
                time_counter += 1
                if time_counter > time_to_wait:break
        except Exception:pass

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    prefs = {'download.default_directory' : desk_location,
            'profile.default_content_setting_values.automatic_downloads': 1
        }
    chromeOptions.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(chrome_options=chromeOptions)
    download_files()

At the moment the downloaded files end up directly in the master folder, outside their corresponding folders.

asked Feb 11 '19 08:02 by robots.txt



1 Answer

I just added a rename to move the file. The script works as you have it, but once a file has downloaded, it is moved to the correct path:

os.rename(desk_location + '\\' + filename, file_location)
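
A slightly more portable variant of the same move, as a minimal sketch: it builds the paths with os.path.join instead of a hard-coded backslash and uses shutil.move. The helper name move_to_subfolder is hypothetical and not part of the original script; the subfolder name is derived the same way as folder_name in the script.

```python
import os
import shutil


def move_to_subfolder(download_dir, filename):
    """Move download_dir/filename into download_dir/<folder_name>/filename.

    Hypothetical helper: the folder name is the part of the filename
    before the first dot, as in the script's folder_name.
    """
    folder_name = filename.split(".")[0]
    target_dir = os.path.join(download_dir, folder_name)
    # Create the subfolder if it is not there yet.
    os.makedirs(target_dir, exist_ok=True)
    source = os.path.join(download_dir, filename)
    destination = os.path.join(target_dir, filename)
    shutil.move(source, destination)
    return destination
```

Called as move_to_subfolder(desk_location, filename) after the download has finished, this would take the place of the os.rename line.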

Full Code:

import os
import time
from selenium import webdriver

link ='https://www.online-convert.com/file-format/doc' 

dirf = os.path.expanduser('~')
desk_location = dirf + r'\Desktop\file_folder'
if not os.path.exists(desk_location):
    os.mkdir(desk_location)

def download_files():
    driver.get(link)
    for item in driver.find_elements_by_css_selector("a[href$='.doc']")[:2]:
        filename = item.get_attribute("href").split("/")[-1]
        #create a new folder, named after the file, to store the downloaded file in
        folder_name = item.get_attribute("href").split("/")[-1].split(".")[0]
        #set the new location of the folders to be created
        new_location = os.path.join(desk_location,folder_name)
        if not os.path.exists(new_location):
            os.mkdir(new_location)
        #set the location of the folders the downloaded files will be within
        file_location = os.path.join(new_location,filename)
        item.click()

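        time_to_wait = 10
        time_counter = 0
        #the browser saves the file in the master folder first, so wait for it there
        downloaded_file = os.path.join(desk_location, filename)

        while not os.path.exists(downloaded_file):
            time.sleep(1)
            time_counter += 1
            if time_counter > time_to_wait:break
        #move the file into its own folder once the download has shown up
        if os.path.exists(downloaded_file):
            os.rename(downloaded_file, file_location)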

if __name__ == '__main__':
    chromeOptions = webdriver.ChromeOptions()
    prefs = {'download.default_directory' : desk_location,
            'profile.default_content_setting_values.automatic_downloads': 1
        }
    chromeOptions.add_experimental_option('prefs', prefs)
    driver = webdriver.Chrome(chrome_options=chromeOptions)
    download_files()
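
The waiting step could also be pulled out into a small helper. This is only a sketch under assumptions: the name wait_for_download and its parameters are hypothetical, and it assumes Chrome keeps an unfinished download next to the target as a file ending in .crdownload.

```python
import os
import time


def wait_for_download(path, timeout=10.0, poll=0.5):
    """Return True once `path` exists with no partial file left, else False.

    Assumption: an in-progress Chrome download is stored as
    `<path>.crdownload` until it completes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path) and not os.path.exists(path + ".crdownload"):
            return True
        time.sleep(poll)
    # Timed out without seeing a finished file.
    return False
```

With this helper, the file would only be moved when wait_for_download(os.path.join(desk_location, filename)) returns True, so a slow or failed download is skipped rather than raising an error.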
answered Oct 01 '22 17:10 by chitown88