I am about to write some computationally-intensive Python code that'll almost certainly spend most of its time inside numpy
's linear algebra functions.
The problem at hand is embarrassingly parallel. Long story short, the easiest way for me to take advantage of that would be by using multiple threads. The main barrier is almost certainly going to be the Global Interpreter Lock (GIL).
To help design this, it would be useful to have a mental model for which numpy
operations can be expected to release the GIL for their duration. To this end, I'd appreciate any rules of thumb, dos and don'ts, pointers etc.
In case it matters, I'm using 64-bit Python 2.7.1 on Linux, with numpy
1.5.1 and scipy
0.9.0rc2, built with Intel MKL 10.3.1.
The Python Global Interpreter Lock or GIL, in simple words, is a mutex (or a lock) that allows only one thread to hold the control of the Python interpreter. This means that only one thread can be in a state of execution at any point in time.
Quite some numpy routines release GIL, so they can be efficiently parallel in threads (info). Maybe you don't need to do anything special! You can use this question to find whether the routines you need are among the ones that release GIL.
A global interpreter lock (GIL) is a mechanism used in computer-language interpreters to synchronize the execution of threads so that only one native thread (per process) can execute at a time. An interpreter that uses GIL always allows exactly one thread to execute at a time, even if run on a multi-core processor.
Don't expect Python 3.11 to drop the GIL just yet. Merging Sam's work back to CPython will itself be a laborious process, but is only part of what's needed: a very good backwards compatibility and migration plan for the community is needed before CPython drops the GIL. None of this is planned yet.
Quite some numpy routines release GIL, so they can be efficiently parallel in threads (info). Maybe you don't need to do anything special!
You can use this question to find whether the routines you need are among the ones that release GIL. In short, search for ALLOW_THREADS
or nogil
in the source.
(Also note that MKL has the ability to use multiple threads for a routine, so that's another easy way to get parallelism, although possibly not the fastest kind).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With