Is there a way to release the GIL for pure functions using pure python?

I think I must be missing something; this seems so right, but I can't see a way to do this.

Say you have a pure function in Python:

from math import sin, cos

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

Is there some built-in functionality or library that provides a wrapper of some sort that can release the GIL during the function's execution?

In my mind I am thinking of something along the lines of

from math import sin, cos
from somelib import pure

@pure
def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

Why do I think this might be useful?

Because multithreading, which is currently only attractive for I/O-bound programs, would become attractive for such functions once they become long-running. One could then write something like

from math import sin, cos
from somelib import pure
from asyncio import run, gather, create_task

@pure  # releases GIL for f
async def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)


async def main():
    step_size = 0.1
    # t * step_size walks the interval [0, 10) in increments of step_size
    result = await gather(*[create_task(f(t * step_size))
                            for t in range(0, round(10 / step_size))])
    return result

if __name__ == "__main__":
    results = run(main())
    print(results)
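
For comparison, here is a minimal sketch of the closest thing that runs today using only the standard library, assuming Python 3.9+ for `asyncio.to_thread`. It is not the hypothetical `@pure` decorator: each call does run in a worker thread, but on a standard CPython build those threads still take turns holding the GIL, so no CPU-bound speedup should be expected from it.

```python
import asyncio
from math import sin, cos


def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)


async def main():
    step_size = 0.1
    steps = round(10 / step_size)
    # asyncio.to_thread runs each call in a worker thread of the default
    # executor; the GIL is NOT released while f executes Python bytecode.
    return await asyncio.gather(
        *(asyncio.to_thread(f, t * step_size) for t in range(steps))
    )


results = asyncio.run(main())
print(len(results), results[0])   # 100 (0.0, 5.0)
```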

Of course, multiprocessing offers Pool.map, which can do something very similar. However, if the function returns a non-primitive (complex) type, the worker has to serialize it and the main process has to deserialize it into a new object, which forces a copy. With threads, the child thread passes a pointer and the main thread simply takes ownership of the object. Much faster (and cleaner?).
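
A small sketch of that difference. The `pickle` round trip stands in for what happens when a result crosses a process boundary; `make_states` and `worker` are hypothetical helpers made up for this illustration.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def make_states():
    # Hypothetical stand-in for a non-trivial result, e.g. a sequence of board states.
    return [[0] * 8 for _ in range(8)]


# Process boundary (simulated): the result is pickled by the worker and
# unpickled by the parent, which yields an equal but distinct object.
original = make_states()
copied = pickle.loads(pickle.dumps(original))

# Thread boundary: the future hands back a reference to the very object
# the worker created; nothing is serialized or copied.
created_in_thread = []


def worker():
    result = make_states()
    created_in_thread.append(result)   # remember the object built in the thread
    return result


with ThreadPoolExecutor(max_workers=1) as pool:
    from_thread = pool.submit(worker).result()

print(copied == original, copied is original)   # True False
print(from_thread is created_in_thread[0])      # True
```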

To tie this to a practical problem I encountered a few weeks ago: I was doing a reinforcement learning project that involved building an AI for a chess-like game. For this, I simulated the AI playing against itself for more than 100,000 games, each game returning the resulting sequence of board states (a numpy array). These games are generated in a loop, and I use the data to create a stronger version of the AI each time. Here, re-creating ("malloc") the sequence of states for each game in the main process was the bottleneck. I experimented with re-using existing objects, which is a bad idea for many reasons, and it didn't yield much improvement anyway.

Edit: This question differs from "How to run functions in parallel?", because I am not just looking for any way to run code in parallel (I know this can be achieved in various ways, e.g. via multiprocessing). I am looking for a way to let the interpreter know that nothing bad will happen when this function gets executed in a parallel thread.

FirefoxMetzger asked Dec 04 '20 08:12


1 Answer

Is there a way to release the GIL for pure functions using pure python?

In short, the answer is no, because those functions aren't pure at the level at which the GIL operates.

The GIL serves not just to protect objects from being updated concurrently by Python code; its primary purpose is to prevent the interpreter from performing a data race (which is undefined behavior, i.e. forbidden in the C memory model) while accessing and updating global and shared data. This includes Python-visible singletons such as None, True, and False, but also all globals like modules, shared dicts, and caches. Then there is their metadata, such as reference counts and type objects, as well as shared data used internally by the implementation.

Consider the provided pure function:

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)

The dis tool reveals the operations that the interpreter performs when executing the function:

>>> dis.dis(f)
  2           0 LOAD_CONST               1 (16)
              2 LOAD_GLOBAL              0 (sin)
              4 LOAD_FAST                0 (t)
              6 CALL_FUNCTION            1
              8 LOAD_CONST               2 (3)
             10 BINARY_POWER
             12 BINARY_MULTIPLY
             14 STORE_FAST               1 (x)
             ...
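
The same information can be pulled out programmatically. The following is a minimal sketch using `dis.get_instructions` to collect the names the function fetches from the global namespace; the opcode names in the full listing differ between CPython versions, but `LOAD_GLOBAL` is the instruction used for these lookups.

```python
import dis
from math import sin, cos


def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2*t) - 2 * cos(3*t) - cos(4*t)
    return (x, y)


# Every LOAD_GLOBAL is a lookup in shared, module-level state at run time.
global_names = sorted({ins.argval for ins in dis.get_instructions(f)
                       if ins.opname == "LOAD_GLOBAL"})

print(global_names)   # ['cos', 'sin']
```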

To run the code, the interpreter must access the global symbols sin and cos in order to call them. It accesses the integers 2, 3, 4, 5, 13, and 16, which are all cached and therefore also global. In case of an error, it looks up the exception classes in order to instantiate the appropriate exceptions. Even when these global accesses don't modify the objects, they still involve writes because they must update the reference counts.
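
Both effects can be observed from Python. This is a sketch that relies on CPython implementation details (the small-integer cache and `sys.getrefcount`), so it illustrates the point rather than describing behavior the language guarantees.

```python
import sys

# Small integers are cached: two independently computed 16s are one object.
a = int("16")
b = int("16")
print(a is b)   # True on CPython

# Merely taking another reference to an object writes to it, because its
# reference count has to be updated.
shared = object()   # stand-in for any object reachable from several threads
before = sys.getrefcount(shared)
alias = shared      # no visible mutation, yet the object's refcount changes
after = sys.getrefcount(shared)
print(after - before)   # 1
```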

None of that can be done safely from multiple threads without synchronization. While it is conceivably possible to modify the Python interpreter to implement truly pure functions that don't access global state, it would require significant modifications to the internals, affecting compatibility with existing C extensions, including the vastly popular scientific ones. This last point is the principal reason why removing the GIL has proven to be so difficult.
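
As a Python-level analogy only (the interpreter's real bookkeeping lives in C), the sketch below shows what "synchronization" means for a shared read-modify-write update. The dictionary counter plays the role of a piece of shared interpreter state such as a reference count, and the explicit lock plays the role of the GIL; all names are made up for the illustration.

```python
import threading

state = {"count": 0}        # shared data touched by every thread
lock = threading.Lock()     # plays the role the GIL plays for the interpreter


def bump(n):
    for _ in range(n):
        # Read-modify-write on shared state: the lock makes the whole
        # update exclusive, so no increment can be lost.
        with lock:
            state["count"] += 1


threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(state["count"])   # 40000
```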

user4815162342 answered Nov 08 '22 12:11