I have the following code:
from multiprocessing import Pool
import requests

def process_url(url):
    print '111'
    r = requests.get(url)
    print '222' # <-- never even gets here
    return

urls_to_download = [list_of_urls]
PARALLEL_WORKERS = 4

pool = Pool(PARALLEL_WORKERS)
pool.map_async(process_url, urls_to_download)
pool.close()
pool.join()
Every time I run this, it processes the first four items and then just hangs. I don't think it's a timeout issue, as the four URLs download extremely quickly; it is just that after fetching those first four, it hangs indefinitely.
What do I need to do to remedy this?
Even though this question uses Python 2, you can still reproduce this "error" in Python 3. It happens because pool.map_async returns an object of class AsyncResult. To receive the result (or the traceback in case of an error) of the map_async call, you need to call get() on it. Joining the pool will not work here, since the job has already been completed and the outcome is held in the AsyncResult, which acts similar to a Promise.
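To see why get() matters, here is a minimal sketch with a hypothetical task fail() that always raises: close() and join() return normally even though every worker errored, and the traceback only surfaces once get() is called.

from multiprocessing import Pool

def fail(x):
    # hypothetical task, used only to demonstrate error reporting
    raise ValueError('boom')

if __name__ == "__main__":
    pool = Pool(2)
    result = pool.map_async(fail, [1, 2])
    pool.close()
    pool.join()    # finishes without any visible error
    result.get()   # only here is ValueError('boom') re-raised

Note that get() also accepts an optional timeout: result.get(timeout=30) raises multiprocessing.TimeoutError instead of blocking forever if a worker stalls.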
Simply add a call that waits for the result to be received:
from multiprocessing import Pool
import requests

def process_url(url):
    print('111')
    r = requests.get(url)
    print('222') # <-- never even gets here (not anymore!)
    return

if __name__ == "__main__":
    urls_to_download = ['https://google.com'] * 4
    PARALLEL_WORKERS = 4

    pool = Pool(PARALLEL_WORKERS)
    a = pool.map_async(process_url, urls_to_download)

    # Add call here: block until all results are in
    a.get()

    pool.close()
    pool.join()
Output:
111
111
111
111
222
222
222
222
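If you do not actually need the asynchronous variant, a plain pool.map call blocks until every URL has been processed and re-raises worker errors on its own, so the separate get() step disappears. A minimal sketch of the same example (returning the status codes is just an illustration):

from multiprocessing import Pool
import requests

def process_url(url):
    r = requests.get(url)
    return r.status_code

if __name__ == "__main__":
    urls_to_download = ['https://google.com'] * 4
    with Pool(4) as pool:
        # map() blocks until all four downloads are done
        status_codes = pool.map(process_url, urls_to_download)
    print(status_codes)  # e.g. [200, 200, 200, 200]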