Is it possible to take a blocking function such as work
and have it run concurrently in a ProcessPoolExecutor
that has more than one worker?
import asyncio
from time import sleep, time
from concurrent.futures import ProcessPoolExecutor

num_jobs = 4
queue = asyncio.Queue()
executor = ProcessPoolExecutor(max_workers=num_jobs)
loop = asyncio.get_event_loop()

def work():
    sleep(1)

async def producer():
    for i in range(num_jobs):
        results = await loop.run_in_executor(executor, work)
        await queue.put(results)

async def consumer():
    completed = 0
    while completed < num_jobs:
        job = await queue.get()
        completed += 1

s = time()
loop.run_until_complete(asyncio.gather(producer(), consumer()))
print("duration", time() - s)
Running the above on a machine with more than 4 cores takes ~4 seconds. How would you write producer
such that the above example takes only ~1 second?
The problem is in producer. Instead of allowing the jobs to run in the background, it waits for each job to finish, thus serializing them. If you rewrite producer to look like this (and leave consumer unchanged), you get the expected 1s duration:
async def producer():
    for i in range(num_jobs):
        fut = loop.run_in_executor(executor, work)
        fut.add_done_callback(lambda f: queue.put_nowait(f.result()))
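This works because loop.run_in_executor returns an asyncio future, and add_done_callback schedules the callback on the event loop, where queue.put_nowait is safe to call. The producer now returns as soon as all four jobs are submitted; the consumer's counter is what keeps the program running until they all finish.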
await loop.run_in_executor(executor, work) suspends the producer coroutine until work completes, so only one job runs at a time.
To run jobs concurrently, you could use asyncio.as_completed:
async def producer():
    # Submit every job before awaiting any of them, so all four
    # run in the pool at the same time.
    tasks = [loop.run_in_executor(executor, work) for _ in range(num_jobs)]
    # Note: the loop= argument to as_completed was removed in Python 3.10,
    # and it is not needed here.
    for f in asyncio.as_completed(tasks):
        results = await f
        await queue.put(results)
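For completeness, here is a self-contained sketch of the same approach in the modern asyncio style (asyncio.run instead of module-level get_event_loop/run_until_complete, and an executor managed by a with block). It is an illustration under those assumptions, not the original answer's code:

import asyncio
from concurrent.futures import ProcessPoolExecutor
from time import sleep, time

num_jobs = 4

def work():
    # Stand-in for a CPU-bound or otherwise blocking call.
    sleep(1)

async def main():
    queue = asyncio.Queue()
    loop = asyncio.get_running_loop()

    async def producer(executor):
        # Submit all jobs up front so they run on parallel workers.
        tasks = [loop.run_in_executor(executor, work) for _ in range(num_jobs)]
        for f in asyncio.as_completed(tasks):
            await queue.put(await f)

    async def consumer():
        for _ in range(num_jobs):
            await queue.get()

    with ProcessPoolExecutor(max_workers=num_jobs) as executor:
        await asyncio.gather(producer(executor), consumer())

if __name__ == "__main__":  # required for process pools on spawn-based platforms
    s = time()
    asyncio.run(main())
    print("duration", time() - s)

On a machine with at least four cores this should print a duration of roughly one second, matching the expectation in the question.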