I have the following code:
from multiprocessing import Pool
import requests

def process_url(url):
    print '111'
    r = requests.get(url)
    print '222' # <-- never even gets here
    return

urls_to_download = [list_of_urls]
PARALLEL_WORKERS = 4

pool = Pool(PARALLEL_WORKERS)
pool.map_async(process_url, urls_to_download)
pool.close()
pool.join()
Every time I do this, it runs the first four items and then just hangs. I don't think it's a timeout issue, as downloading the four URLs is extremely fast. It's just that after fetching those first four, it hangs indefinitely.
What do I need to do to remedy this?
Even though this question uses Python 2, you can still reproduce this "error" in Python 3. It happens because pool.map_async returns an object of class AsyncResult. To receive the result (or the traceback, in case of error) of the map_async call, you need to call get() on it. Joining the pool will not work here, since the jobs have already been submitted and their outcome is sitting in the AsyncResult, which acts much like a Promise.
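As a side note, the AsyncResult returned by map_async also exposes ready() and wait() alongside get(), which is what makes it behave like a Promise. A minimal sketch, using a trivial square worker purely for illustration:

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    with Pool(2) as pool:
        result = pool.map_async(square, [1, 2, 3])
        result.wait()                 # block until all tasks have finished
        print(result.ready())         # True once the result is available
        print(result.get(timeout=5))  # [1, 4, 9]; raises if a worker errored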
Back to the code in the question: simply add a get() call to wait for the result to be received:
from multiprocessing import Pool
import requests

def process_url(url):
    print('111')
    r = requests.get(url)
    print('222') # <-- never even gets here (not anymore!)
    return

if __name__ == "__main__":
    urls_to_download = ['https://google.com'] * 4
    PARALLEL_WORKERS = 4

    pool = Pool(PARALLEL_WORKERS)
    a = pool.map_async(process_url, urls_to_download)

    # Add call here
    a.get()

    pool.close()
    pool.join()
Output
111
111
111
111
222
222
222
222
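A related benefit of calling get() is that it re-raises, in the parent process, any exception thrown inside a worker, along with its traceback; with a bare map_async the failure would go unnoticed. A minimal sketch, with a deliberately failing worker that is purely illustrative:

from multiprocessing import Pool

def fail(x):
    raise ValueError(f'bad input: {x}')

if __name__ == "__main__":
    with Pool(2) as pool:
        result = pool.map_async(fail, [1, 2])
        try:
            result.get()
        except ValueError as e:
            print('worker raised:', e)  # e.g. worker raised: bad input: 1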