
How to use progressbar module with urlretrieve

My Python 3 script downloads a number of images over the internet using urlretrieve, and I'd like to add a progress bar showing the completed percentage and download speed for each download.

The progressbar module seems like a good solution. I've looked through its examples, and example4 seems like the right one, but I still can't work out how to wrap it around urlretrieve.

I guess I should add a third parameter:

urllib.request.urlretrieve('img_url', 'img_filename', some_progressbar_based_reporthook)

But how do I properly define it?

Vasily asked Jun 10 '16 12:06


3 Answers

I think a better solution is to create a class that holds all the needed state:

import urllib.request

import progressbar


class MyProgressBar:
    def __init__(self):
        self.pbar = None

    def __call__(self, block_num, block_size, total_size):
        # Create the bar on the first call, once total_size is known
        if self.pbar is None:
            self.pbar = progressbar.ProgressBar(maxval=total_size)
            self.pbar.start()

        # block_num * block_size can overshoot on the final block
        downloaded = block_num * block_size
        if downloaded < total_size:
            self.pbar.update(downloaded)
        else:
            self.pbar.finish()

Then pass an instance as the reporthook:

urllib.request.urlretrieve('img_url', 'img_filename', MyProgressBar())
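
Since the question also asks for download speed, the same keep-the-state-in-a-class idea can be sketched with only the standard library, without the progressbar module. The class name SpeedHook, its attributes, and the clock parameter are hypothetical and not part of urllib or progressbar; the clock defaults to time.monotonic and is injectable only so the arithmetic can be checked without a real download.

```python
import time


class SpeedHook:
    """Reporthook-style callable that tracks bytes and average speed."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock      # injectable so the maths can be checked offline
        self.start = None       # time of the first hook call
        self.downloaded = 0     # bytes received so far (clamped to the total)
        self.speed = 0.0        # average bytes per second since the first call

    def __call__(self, block_num, block_size, total_size):
        now = self.clock()
        if self.start is None:
            self.start = now
        downloaded = block_num * block_size
        if total_size > 0:
            # the final block is usually short, so clamp the overshoot
            downloaded = min(downloaded, total_size)
        self.downloaded = downloaded
        elapsed = now - self.start
        self.speed = downloaded / elapsed if elapsed > 0 else 0.0
        print(f"\r{self.downloaded} bytes, {self.speed / 1024:.1f} KiB/s", end="")
```

It would be passed the same way as MyProgressBar(), with a fresh instance per image so the timer restarts for each download.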
George C answered Nov 07 '22 22:11


The suggestion in the other answer did not progress for me past 1%. Here is a complete implementation that works for me on Python 3:

import progressbar
import urllib.request


pbar = None


def show_progress(block_num, block_size, total_size):
    global pbar
    if pbar is None:
        pbar = progressbar.ProgressBar(maxval=total_size)
        pbar.start()

    downloaded = block_num * block_size
    if downloaded < total_size:
        pbar.update(downloaded)
    else:
        pbar.finish()
        pbar = None


urllib.request.urlretrieve(model_url, model_file, show_progress)
Nic Dahlquist answered Nov 08 '22 00:11


The hook is defined as:

urlretrieve(url[, filename[, reporthook[, data]]])

"The third argument, if present, is a hook function that will be called once on establishment of the network connection and once after each block read thereafter. The hook will be passed three arguments; a count of blocks transferred so far, a block size in bytes, and the total size of the file. The third argument may be -1 on older FTP servers which do not return a file size in response to a retrieval request."

So, you can write a hook as follows:

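import progressbar

# Module-level state shared across hook calls
pbar = None

def show_progress(count, block_size, total_size):
    global pbar  # required because the hook rebinds pbar
    if pbar is None:
        pbar = progressbar.ProgressBar(maxval=total_size)
        pbar.start()

    # count is the number of blocks transferred so far; the last block
    # may be short, so count * block_size can overshoot total_size
    downloaded = count * block_size
    if downloaded < total_size:
        pbar.update(downloaded)
    else:
        pbar.finish()
        pbar = None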
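
The quoted documentation implies two edge cases that any hook has to handle: the total may be -1 when no size is reported, and count * block_size can exceed the total on the last block. A minimal standard-library sketch of that arithmetic follows; the helper name percent_done is made up for illustration.

```python
def percent_done(count, block_size, total_size):
    """Return progress as a percentage in [0, 100], or None if the size is unknown."""
    if total_size <= 0:
        # no usable size reported (e.g. -1), so no percentage can be computed
        return None
    # clamp, because the final block is usually shorter than block_size
    downloaded = min(count * block_size, total_size)
    return 100.0 * downloaded / total_size
```

A hook could call this and fall back to showing only a byte count when it returns None.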

As a side note, I strongly recommend using the requests library, which is a lot easier to use and lets you iterate over the response with the iter_content() method.
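
The iter_content() approach amounts to writing the chunks yourself and reporting after each one. Here is a rough sketch of that loop, kept independent of the network so it can be tried offline. The names save_chunks and on_progress are hypothetical, and chunks stands in for whatever response.iter_content(chunk_size=...) would yield.

```python
import io


def save_chunks(chunks, out_file, total_size, on_progress):
    """Write each chunk to out_file and report (bytes_written, total_size)."""
    written = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty chunks
        out_file.write(chunk)
        written += len(chunk)
        on_progress(written, total_size)
    return written


# Offline demonstration with fake chunks instead of a real response
buffer = io.BytesIO()
seen = []
save_chunks([b"abcd", b"", b"ef"], buffer, 6, lambda done, total: seen.append(done))
```

With requests, the chunks would come from a response opened with requests.get(url, stream=True), and the total from its Content-Length header when the server provides one. The callback is where a progress bar's update call would go.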

Doron Cohen answered Nov 07 '22 22:11