
How would I track progress on a large batch of grequests?

I sometimes send a large number of requests through Python's grequests.map function. Currently my code looks like this:

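import grequests  # imported first so gevent's monkey-patching runs before requests loads
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# These two are passed into the function. The list of parameters can sometimes be thousands long.
# (Made-up example values; each entry is a one-element tuple so format(*parameters) works.)
local_path = 'https://www.google.com/search?q={}'
parameter_list = [('the+answer+to+life+the+universe+and+everything',), ('askew',), ('fun+facts',)]

def send_requests(local_path, parameter_list):  # function name is a placeholder
    s = requests.Session()
    retries = Retry(total=5, backoff_factor=0.2, status_forcelist=[500, 502, 503, 504], raise_on_redirect=True, raise_on_status=True)
    s.mount('http://', HTTPAdapter(max_retries=retries))
    s.mount('https://', HTTPAdapter(max_retries=retries))
    async_list = []
    for parameters in parameter_list:
        URL = local_path.format(*parameters)
        async_list.append(grequests.get(URL, session=s))
    results = grequests.map(async_list)
    return results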

I am a fan of the tqdm library and would love a progress indicator showing how many requests have completed and how many are still pending, but I'm not sure whether it's possible to poll, or to attach a hook, from grequests.get or from the Session. I did try grequests.get(URL, hooks={'response': test}, session=s), but this seemed to feed the response itself into the test function, and results then contained only None.

Edit: shortly after posting this question I explored the return values from a test hook function, but whatever I try, it seems that when a hook is present the map function does not block until it has responses, resulting in None responses and nothing coming from my hook either.

How can I track progress on a large number of requests?

DoubleDouble asked Nov 17 '25 11:11


1 Answer

Using the hooks parameter was the correct solution. The test callback I had set up was raising an exception (curse those tiny scoping errors), and since I have no exception handler set up for my requests, it failed silently and resulted in None responses.
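For illustration, a scoping slip of this kind can be reproduced without any network calls. The sketch below is an assumption about what went wrong, not the original failing code: FakeBar is a made-up stand-in for tqdm, and run_without_global and run_with_global are hypothetical names.

```python
class FakeBar:
    """Made-up stand-in for tqdm: just counts update() calls."""

    def __init__(self):
        self.n = 0

    def update(self):
        self.n += 1


track_requests = None  # module-level name that the callback reads


def request_fulfilled(r, *args, **kwargs):
    # Looks up the module-level track_requests at call time.
    track_requests.update()


def run_without_global():
    # Without a `global` statement this assignment creates a local variable,
    # so the module-level track_requests is still None inside the callback.
    track_requests = FakeBar()
    try:
        request_fulfilled("response")
    except AttributeError as exc:
        return exc
    return None


def run_with_global():
    global track_requests  # rebinding now affects the module-level name
    track_requests = FakeBar()
    request_fulfilled("response")
    count = track_requests.n
    track_requests = None
    return count


error = run_without_global()
count = run_with_global()
print(type(error).__name__, count)  # prints "AttributeError 1"
```

Inside a real response hook, that AttributeError is raised while the request is being sent, which matches the None entries described above when no exception handler is supplied.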

This is the setup I ended up with.

import grequests  # imported first so gevent's monkey-patching runs before requests loads
import requests
from requests.adapters import HTTPAdapter
from tqdm import tqdm
from urllib3.util.retry import Retry

track_requests = None  # progress bar shared with the response hook

def request_fulfilled(r, *args, **kwargs):
    # Called by requests for each response the session receives.
    track_requests.update()

local_path = 'https://www.google.com/search?q={}'
parameter_list = [('the+answer+to+life+the+universe+and+everything',), ('askew',), ('fun+facts',)]

def send_requests(local_path, parameter_list):  # function name is a placeholder
    global track_requests  # missing this line was the cause of my issue...
    s = requests.Session()
    s.hooks['response'].append(request_fulfilled)  # assign hook here
    retries = Retry(total=5, backoff_factor=0.2, status_forcelist=[500, 502, 503, 504], raise_on_redirect=True, raise_on_status=True)
    s.mount('http://', HTTPAdapter(max_retries=retries))
    s.mount('https://', HTTPAdapter(max_retries=retries))
    async_list = []
    for parameters in parameter_list:
        URL = local_path.format(*parameters)
        async_list.append(grequests.get(URL, session=s))
    track_requests = tqdm(total=len(async_list))
    results = grequests.map(async_list)
    track_requests.close()
    track_requests = None
    return results
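As a possible variation (not part of the setup above), the module-level global can be avoided by building the hook as a closure, and an exception_handler can be passed to grequests.map so that a failing hook or request is reported rather than silently becoming None. This is a sketch only: make_progress_hook, report_failure, and fetch_with_progress are hypothetical names, and the session and retry configuration are omitted for brevity.

```python
def make_progress_hook(bar):
    """Return a requests response hook that advances `bar` once per response."""

    def hook(response, *args, **kwargs):
        bar.update(1)
        # Returning None leaves the response object unchanged.

    return hook


def report_failure(request, exception):
    """Handler for grequests.map: print the error instead of hiding it."""
    print(f"request to {request.url} failed: {exception!r}")


def fetch_with_progress(urls):
    # Imported here only so the helpers above work without the third-party
    # packages installed; in a real script grequests would normally be
    # imported at the top of the module, before requests.
    import grequests
    from tqdm import tqdm

    with tqdm(total=len(urls)) as bar:
        hooks = {"response": make_progress_hook(bar)}
        pending = [grequests.get(url, hooks=hooks) for url in urls]
        # Entries for failed requests come back as whatever the handler
        # returns (None here).
        return grequests.map(pending, exception_handler=report_failure)
```

It would be called as fetch_with_progress(list_of_urls). Requests that raise before a response arrives never trigger the response hook, so the bar can stop short of its total when some requests fail.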
DoubleDouble answered Nov 18 '25 23:11


