Is asyncio.run_in_executor specified ambiguously?

I have a server application and when requested by the client I schedule some work, like

import asyncio
import time

def work():
    time.sleep(5)  # stands in for a blocking task

fut = asyncio.get_event_loop().run_in_executor(None, work)

I await fut later when it is requested explicitly. My use case requires that run_in_executor submit the work function immediately, and that behaves as expected in my environment (Ubuntu 16.04, Python 3.7.1).

Since my application depends on this behavior I wanted to verify that it is not something likely to change, so I checked several resources:

  1. The documentation is somewhat vague. The "awaitable" label could apply either to the method itself or to its return value, though the body of the text does say explicitly that it returns an asyncio.Future.
  2. PEP 3156, which specifies asyncio. It says nothing that suggests run_in_executor is a coroutine.
  3. In a few issues, whether run_in_executor is a function that returns an awaitable or is itself a coroutine seems to be treated as an implementation detail. See 25675 and 32327.
  4. AbstractEventLoop.run_in_executor is specified as a coroutine, but the implementation in BaseEventLoop.run_in_executor is a plain function.

1 and 2 mostly seem to indicate that the current behavior is correct, but 3 and 4 are concerning. This seems like a very important part of the interface because if the function itself is a coroutine then it will not begin executing (therefore will not schedule the work) until it is awaited.
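To make the distinction concrete, here is a minimal sketch of the two styles. The names coro_style and plain_style are hypothetical stand-ins, not asyncio APIs; they only record when their bodies run.

```python
import asyncio

calls = []  # records which function bodies have actually executed

async def coro_style():
    # Hypothetical coroutine-style API: the body runs only once it is awaited.
    calls.append("coro_style")

def plain_style(loop):
    # Hypothetical plain-function API: the body runs at call time
    # and hands back an awaitable (here an unresolved Future).
    calls.append("plain_style")
    return loop.create_future()

async def main():
    loop = asyncio.get_running_loop()
    c = coro_style()        # only creates a coroutine object
    f = plain_style(loop)   # body has already executed
    before_await = list(calls)
    await c                 # now the coroutine body runs
    f.cancel()              # discard the placeholder future
    return before_await, list(calls)

before_await, after_await = asyncio.run(main())
print(before_await)  # ['plain_style']
print(after_await)   # ['plain_style', 'coro_style']
```

If run_in_executor behaved like coro_style, the work would not be submitted until the result was awaited, which is exactly the concern raised here.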

Is it safe to rely on the current behavior? If so, is it reasonable to change the interface of AbstractEventLoop.run_in_executor to a plain function instead of a coroutine?

Chris Hunt asked Jan 19 '19 02:01


1 Answer

"My use case requires that run_in_executor submit the work function immediately, and that behaves as expected in my environment"

The current behavior is not guaranteed by the documentation, which only specifies that the function arranges for func to be called and that it returns an awaitable. If it were implemented as a coroutine, it would not submit the work until the coroutine was run by the event loop.

However, this behavior has been present since the beginning and is extremely unlikely to change. Delaying submission, though technically allowed by the docs, would break many real-world asyncio applications and constitute a serious backwards-incompatible change.
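One way to check what a particular environment does is to introspect the method on the running loop. This is only a sketch of such a check, not a guarantee about any loop implementation; the work function is a placeholder, and a third-party event loop could report something different.

```python
import asyncio
import inspect

def work():
    # Placeholder for the real blocking task.
    return "done"

async def main():
    loop = asyncio.get_running_loop()
    # Is the method on this loop a coroutine function, or a plain function?
    is_coro_func = inspect.iscoroutinefunction(loop.run_in_executor)
    fut = loop.run_in_executor(None, work)
    # Is the returned object a Future, as opposed to a coroutine object?
    is_future = isinstance(fut, asyncio.Future)
    result = await fut
    return is_coro_func, is_future, result

is_coro_func, is_future, result = asyncio.run(main())
print("coroutine function:", is_coro_func)
print("returns asyncio.Future:", is_future)
```

A plain function that returns a Future is consistent with the work being submitted at call time, but the check describes only the loop it runs on.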

If you wanted to ensure that the task starts without depending on undocumented behavior, you could create your own function equivalent to run_in_executor. It really boils down to combining executor.submit and asyncio.wrap_future. Without frills, it could be as simple as:

import asyncio

def my_run_in_executor(executor, f, *args):
    return asyncio.wrap_future(executor.submit(f, *args))

Because executor.submit is called directly in the function, this version guarantees that the worker function is started without waiting for the event loop to run.
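As a rough usage sketch, the following shows the worker starting before anything is awaited. The threading.Event and the blocking started.wait call are there only to make the start observable; blocking the event loop thread like this is not something to do in real code. Note also that this helper, unlike run_in_executor, needs an actual executor and does not accept None.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

def my_run_in_executor(executor, f, *args):
    # Submit right away, then wrap the concurrent future for asyncio.
    return asyncio.wrap_future(executor.submit(f, *args))

async def main():
    started = threading.Event()

    def work():
        started.set()   # signal that the worker thread is running
        return 42

    with ThreadPoolExecutor(max_workers=1) as executor:
        fut = my_run_in_executor(executor, work)
        # No await has happened yet. Block briefly (demo only) to observe
        # that the worker runs without the event loop getting control.
        started_before_await = started.wait(timeout=5)
        result = await fut
    return started_before_await, result

started_before_await, result = asyncio.run(main())
print(started_before_await, result)  # True 42
```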

PEP 3156 explicitly states that run_in_executor is "equivalent to wrap_future(executor.submit(callback, *args))", thus providing the needed guarantee - but the PEP is not the official documentation, and the final implementation and specification often diverge from the initial PEP.

If one insisted on sticking to the documented interface of run_in_executor, it is also possible to use explicit synchronization to force the coroutine to wait for the worker to start:

async def run_now(f, *args):
    loop = asyncio.get_event_loop()
    started = asyncio.Event()
    def wrapped_f():
        loop.call_soon_threadsafe(started.set)
        return f(*args)
    fut = loop.run_in_executor(None, wrapped_f)
    await started.wait()
    return fut

fut = await run_now(work)
# here the worker has started, but not (necessarily) finished
result = await fut
# here the worker has finished and we have its return value

This approach introduces unnecessary complexity in both implementation and interface. Particularly jarring is the need to use await just to obtain a future, which runs counter to how asyncio normally works. run_now is included only for completeness, and I would not recommend using it in production.

user4815162342 answered Oct 25 '22 19:10