I'm working on a script to download a group of files. I got that part done and it's working decently. Now I'm trying to add a dynamic printout of the download's progress.
For small downloads (of .mp4 files, by the way), such as 5 MB, the progress indicator works great and the file closes successfully, resulting in a complete, working .mp4 file. For larger files, 250 MB and above, it does not work: I get the following error:
And here's my code:
import urllib.request
import shutil
import os
import sys
import io

script_dir = os.path.dirname('C:/Users/Kenny/Desktop/')
rel_path = 'stupid_folder/video.mp4'
abs_file_path = os.path.join(script_dir, rel_path)
url = 'https://archive.org/download/SF145/SF145_512kb.mp4'

# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(abs_file_path, 'wb') as out_file:
    eventID = 123456

    resp = urllib.request.urlopen(url)
    length = resp.getheader('content-length')
    if length:
        length = int(length)
        blocksize = max(4096, length//100)
    else:
        blocksize = 1000000 # just made something up

    # print(length, blocksize)

    buf = io.BytesIO()
    size = 0
    while True:
        buf1 = resp.read(blocksize)
        if not buf1:
            break
        buf.write(buf1)
        size += len(buf1)
        if length:
            print('\r[{:.1f}%] Downloading: {}'.format(size/length*100, eventID), end='')#print('\rDownloading: {:.1f}%'.format(size/length*100), end='')
    print()
    shutil.copyfileobj(response, out_file)
This works perfectly with small files, but with larger ones I get the error. I do NOT get the error, however, with larger files if I comment out the progress indicator code:
with urllib.request.urlopen(url) as response, open(abs_file_path, 'wb') as out_file:
    # eventID = 123456
    #
    # resp = urllib.request.urlopen(url)
    # length = resp.getheader('content-length')
    # if length:
    #     length = int(length)
    #     blocksize = max(4096, length//100)
    # else:
    #     blocksize = 1000000 # just made something up
    #
    # # print(length, blocksize)
    #
    # buf = io.BytesIO()
    # size = 0
    # while True:
    #     buf1 = resp.read(blocksize)
    #     if not buf1:
    #         break
    #     buf.write(buf1)
    #     size += len(buf1)
    #     if length:
    #         print('\r[{:.1f}%] Downloading: {}'.format(size/length*100, eventID), end='')#print('\rDownloading: {:.1f}%'.format(size/length*100), end='')
    # print()
    shutil.copyfileobj(response, out_file)
Does anyone have any ideas? This is the last part of my project and I would really like to be able to see the progress. Once again, this is Python 3.5. Thanks for any help provided!
You're opening your url twice, once as response and once as resp. With your progress bar stuff, you're consuming the data, so when the file is copied using copyfileobj, the data is empty (well, maybe that is inaccurate since it works for small files, but you are doing things twice here and that is probably the origin of your problem).
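As a quick illustration of the consumption issue, here is a minimal sketch (reusing the question's URL) showing that a single response handle has nothing left once it has been read to the end. Your code actually uses two separate handles, so instead of an empty file you pay for the whole download twice, but the principle is the same:

import urllib.request

url = 'https://archive.org/download/SF145/SF145_512kb.mp4'

resp = urllib.request.urlopen(url)
data = resp.read()     # drain the response completely, like the progress loop does
print(len(data))       # full payload on the first read
print(resp.read())     # b'' -- a second read finds nothing left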
To get a progress bar AND a valid file, do this:
with urllib.request.urlopen(url) as response, open(abs_file_path, 'wb') as out_file:
    eventID = 123456

    length = response.getheader('content-length')
    if length:
        length = int(length)
        blocksize = max(4096, length//100)
    else:
        blocksize = 1000000 # just made something up

    size = 0
    while True:
        buf1 = response.read(blocksize)
        if not buf1:
            break
        out_file.write(buf1)
        size += len(buf1)
        if length:
            print('\r[{:.1f}%] Downloading: {}'.format(size/length*100, eventID), end='')
    print()
Simplifications done to your code:
- open the url only once with urlopen, as response
- no intermediary BytesIO; write directly to out_file
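Alternatively, if you'd rather not manage the read loop yourself, the standard library's urllib.request.urlretrieve accepts a reporthook callback that it invokes after each block. This is only a sketch of an equivalent progress printout; the destination path here is an assumption, and eventID is reused from the question for illustration:

import urllib.request

url = 'https://archive.org/download/SF145/SF145_512kb.mp4'
abs_file_path = 'video.mp4'  # assumption: substitute the path built earlier in the question
eventID = 123456

def reporthook(block_num, block_size, total_size):
    # called by urlretrieve after each block; total_size is -1 when unknown
    if total_size > 0:
        percent = min(block_num * block_size / total_size * 100, 100)
        print('\r[{:.1f}%] Downloading: {}'.format(percent, eventID), end='')

urllib.request.urlretrieve(url, abs_file_path, reporthook)
print()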