 

Download file from URL and save it in a folder Python

I have a lot of URLs pointing to .docx and .pdf files. I want to run a Python script that downloads each file and saves it in a folder. Here is what I've done for a single file; I'll wrap it in a for loop later:

import requests

response = requests.get('http://wbesite.com/Motivation-Letter.docx')
with open("my_file.docx", 'wb') as f:
    f.write(response.content)

But the my_file.docx it saves is only 266 bytes and is corrupt, even though the URL is fine.

UPDATE:

I added this code and it works, but I want to save the file in a new folder.

import os
import shutil
import requests

def download_file(url, folder_name):
    local_filename = url.split('/')[-1]
    path = os.path.join("/{}/{}".format(folder_name, local_filename))
    with requests.get(url, stream=True) as r:
        with open(path, 'wb') as f:
            shutil.copyfileobj(r.raw, f)

    return local_filename
Chaudhry Talha asked Jul 09 '19 10:07

2 Answers

Try using the stream option:

import os
import requests


def download(url: str, dest_folder: str):
    os.makedirs(dest_folder, exist_ok=True)  # create folder if it does not exist

    filename = url.split('/')[-1].replace(" ", "_")  # be careful with file names
    file_path = os.path.join(dest_folder, filename)

    r = requests.get(url, stream=True)
    if r.ok:
        print("saving to", os.path.abspath(file_path))
        with open(file_path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 8):
                if chunk:
                    f.write(chunk)
                    f.flush()
                    os.fsync(f.fileno())
    else:  # HTTP status code 4XX/5XX
        print("Download failed: status code {}\n{}".format(r.status_code, r.text))


download("http://website.com/Motivation-Letter.docx", dest_folder="mydir")

Note that mydir in the example above is the name of a folder in the current working directory. If mydir does not exist, the script creates it there and saves the file in it. Your user must have permission to create directories and files in the current working directory.
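To see where a relative dest_folder would end up before downloading anything, here is a minimal sketch. The helper name target_path is hypothetical, not part of requests or the answer's code; it also drops any query string from the URL, which the simple split('/') above does not do.

```python
from pathlib import Path
from urllib.parse import urlsplit, unquote


def target_path(url: str, dest_folder: str) -> Path:
    """Return the absolute path a download of `url` would be saved to.

    Hypothetical helper for illustration only.
    """
    # Use only the path part of the URL, so "?dl=1" or "#top" is ignored,
    # then decode %20-style escapes and replace spaces as the answer does.
    name = Path(unquote(urlsplit(url).path)).name.replace(" ", "_")
    if not name:
        raise ValueError("URL has no file name component: " + url)
    # A relative dest_folder is resolved against the current working directory;
    # an absolute one is kept as given.
    return (Path(dest_folder) / name).resolve()


print(target_path("http://website.com/Motivation-Letter.docx", "mydir"))
```

The printed path depends on the directory the script is started from, which is usually the quickest way to find out why a file landed somewhere unexpected.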

You can also pass an absolute path in dest_folder, but check permissions first.
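If a saved file is still only a few hundred bytes, as with the 266-byte my_file.docx in the question, the server may have returned an error or login page instead of the document. That is a guess, not something the question confirms. A rough sketch for checking what was actually written follows; the function name sniff_download is hypothetical, and it relies only on two well-known signatures: .docx files are ZIP archives that start with PK\x03\x04, and PDF files normally start with %PDF-.

```python
def sniff_download(data: bytes) -> str:
    """Guess what the first bytes of a downloaded file contain.

    Hypothetical helper; returns "zip", "pdf", "html" or "unknown".
    """
    if data.startswith(b"PK\x03\x04"):
        return "zip"  # .docx, .xlsx and similar formats are ZIP containers
    if data.startswith(b"%PDF-"):
        return "pdf"
    # HTML error pages usually begin with a doctype or an <html> tag.
    head = data[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"


# Usage (assumes my_file.docx from the question exists):
# with open("my_file.docx", "rb") as f:
#     print(sniff_download(f.read(512)))
```

A result of "html" for a file that should be a .docx suggests looking at r.status_code and r.text, as the else branch in the answer already does.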

P.S. Avoid asking multiple questions in one post.

Ivan Vinogradov answered Sep 16 '22 13:09

Try urllib.request from the standard library:

import urllib.request

# url is the address to fetch; filename is the local path to save it to
urllib.request.urlretrieve(url, filename)
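Since the question mentions many URLs and a new folder, the same call can be wrapped in a loop. This is a minimal sketch under assumptions: retrieve_all is a hypothetical name, the file name is taken from the last segment of the URL path, and a failed download is reported and skipped rather than stopping the run.

```python
import os
import urllib.request
from urllib.parse import urlsplit, unquote


def retrieve_all(urls, dest_folder):
    """Download each URL into dest_folder and return the saved paths.

    Hypothetical helper built on urllib.request.urlretrieve.
    """
    os.makedirs(dest_folder, exist_ok=True)  # create the folder if needed
    saved = []
    for url in urls:
        # Last path segment of the URL, with %20-style escapes decoded.
        filename = os.path.basename(unquote(urlsplit(url).path))
        if not filename:
            print("skipping, no file name in URL:", url)
            continue
        path = os.path.join(dest_folder, filename)
        try:
            urllib.request.urlretrieve(url, path)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            print("failed:", url, exc)
            continue
        saved.append(path)
    return saved


# Usage (the URL is the placeholder from the answer above):
# retrieve_all(["http://website.com/Motivation-Letter.docx"], "mydir")
```

Two URLs that end in the same file name would overwrite each other in this sketch, so a real script may want to add a counter or a hash to the name.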
ncica answered Sep 16 '22 13:09