numpy and Global Interpreter Lock

Tags: python, multithreading, python-multithreading, numpy, gil

I am about to write some computationally-intensive Python code that'll almost certainly spend most of its time inside numpy's linear algebra functions.

The problem at hand is embarrassingly parallel. Long story short, the easiest way for me to take advantage of that would be by using multiple threads. The main barrier is almost certainly going to be the Global Interpreter Lock (GIL).

To help design this, it would be useful to have a mental model for which numpy operations can be expected to release the GIL for their duration. To this end, I'd appreciate any rules of thumb, dos and don'ts, pointers etc.

In case it matters, I'm using 64-bit Python 2.7.1 on Linux, with numpy 1.5.1 and scipy 0.9.0rc2, built with Intel MKL 10.3.1.

751

asked Jun 01 '11 11:06

NPE

1 Answer

Quite a few numpy routines release the GIL, so they can run efficiently in parallel across threads. Maybe you don't need to do anything special!
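As a rough illustration of what "nothing special" could look like, here is a minimal sketch that maps independent linear solves over a thread pool. The work item solve_one and the problem size are made up for the example, and whether np.linalg.solve actually releases the GIL for the duration of the call depends on your numpy version and build, so treat this as something to time on your own setup rather than a guarantee of speedup.

```python
import numpy as np
from multiprocessing.pool import ThreadPool  # thread-backed pool, in the stdlib


def solve_one(seed):
    """Hypothetical work item: build one random linear system and solve it."""
    rng = np.random.RandomState(seed)
    n = 200
    # Adding n to the diagonal makes the matrix strictly diagonally dominant,
    # so it is guaranteed to be nonsingular.
    a = rng.rand(n, n) + n * np.eye(n)
    b = rng.rand(n)
    x = np.linalg.solve(a, b)
    # Return the largest residual entry as a cheap correctness check.
    return float(np.abs(a.dot(x) - b).max())


def run_threaded(seeds, n_threads=4):
    """Run the independent solves on n_threads worker threads."""
    pool = ThreadPool(n_threads)
    try:
        return pool.map(solve_one, seeds)
    finally:
        pool.close()
        pool.join()


residuals = run_threaded(range(8))
```

If the wall-clock time of run_threaded drops as n_threads goes up, the routine is most likely releasing the GIL in your build; if it stays flat, it probably is not.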

To find out whether the routines you need are among the ones that release the GIL, search for ALLOW_THREADS or nogil in the source.
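The search itself can be done with grep, but a small script works too. The sketch below walks a source checkout and reports lines containing either marker. The function name find_gil_markers and the list of file extensions are my own choices, not anything numpy provides, and you need an actual source tree, since an installed package mostly contains compiled binaries.

```python
import os
import re

# The two markers mentioned above: C-level *ALLOW_THREADS macros and Cython's nogil.
MARKERS = re.compile(r"ALLOW_THREADS|nogil")

# File types worth scanning; this list is an assumption, extend it as needed.
SOURCE_EXTENSIONS = (".c", ".h", ".cpp", ".src", ".pyx", ".pxd")


def find_gil_markers(root):
    """Return (path, line_number, stripped_line) for every marker hit under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            # Read as bytes and decode leniently so odd encodings don't abort the scan.
            with open(path, "rb") as handle:
                for lineno, raw in enumerate(handle, 1):
                    line = raw.decode("utf-8", "replace")
                    if MARKERS.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


# Example use (the path is a placeholder for wherever your checkout lives):
# for path, lineno, line in find_gil_markers("/path/to/numpy-source"):
#     print("%s:%d: %s" % (path, lineno, line))
```

A hit only tells you that some code path in that file releases the GIL; you still have to read the surrounding function to see whether it is the routine you care about.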

(Also note that MKL can use multiple threads within a single routine, so that's another easy way to get parallelism, although possibly not the fastest kind.)
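If you go with your own threads, you may want to control how many threads MKL uses internally so the two levels of parallelism don't compete for cores. A minimal sketch, assuming numpy is linked against MKL (the MKL_NUM_THREADS environment variable has no effect otherwise) and that it is set before numpy is first imported in the process:

```python
import os

# Assumption: one MKL thread per call, because the parallelism comes from
# Python-level threads. Set a larger value instead to let MKL parallelise
# each routine internally.
os.environ["MKL_NUM_THREADS"] = "1"

import numpy as np  # imported after the variable is set, on purpose

a = np.ones((64, 64))
product = a.dot(a)  # with an MKL build, this call honours the setting above
```

Which of the two approaches is faster depends on the matrix sizes and the machine, so it is worth timing both.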

198

answered Sep 25 '22 14:09

Mark
