I'm trying to decide if I should use multiprocessing or threading, and I've learned some interesting bits about the Global Interpreter Lock. In this nice blog post, it seems multithreading isn't suitable for CPU-bound tasks. However, I also learned that some functionality, such as I/O or numpy, is unaffected by the GIL.
Can anyone explain why, and how I can find out if my (probably quite numpy-heavy) code is going to be suitable for multithreading?
There is a category of libraries that are not affected by the CPython GIL, and NumPy is an example that came up while researching this, as discussed in this post on Stack Overflow.
The Python Global Interpreter Lock, or GIL, is in simple terms a mutex (a lock) that allows only one thread to hold control of the Python interpreter at a time. In other words, only one thread can be executing Python bytecode at any given moment, so CPU-bound pure-Python code cannot take advantage of multiple processors with threads. CPython's memory management is not thread-safe, and the GIL is what prevents race conditions and keeps the interpreter thread-safe.
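To see what that means in practice, here is a minimal sketch (the count_down function and the iteration count are purely illustrative): two threads doing CPU-bound, pure-Python work take roughly as long as doing the same work sequentially, because only the thread holding the GIL can execute bytecode at any moment.

```python
import threading
import time

def count_down(n):
    # CPU-bound, pure-Python loop; never releases the GIL for long
    while n > 0:
        n -= 1

N = 20_000_000

# Sequential baseline
start = time.perf_counter()
count_down(N)
count_down(N)
print("sequential :", time.perf_counter() - start)

# Two threads: on CPython this is typically no faster (often slower),
# because the threads take turns holding the GIL.
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("two threads:", time.perf_counter() - start)
```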
Most NumPy functions do not parallelise themselves: a single call typically runs on one CPU core (operations delegated to a multithreaded BLAS are the main exception), and nothing runs on the GPU. Numba, by contrast, can compile loops that explicitly use the parallel execution capabilities of your machine, as in the sketch below.
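Here is a small sketch of that Numba approach (this assumes Numba is installed; the function name is made up for illustration): the loop is compiled and explicitly split across cores with prange.

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def parallel_sum_of_squares(x):
    # prange tells Numba it may distribute iterations across CPU cores
    total = 0.0
    for i in prange(x.shape[0]):
        total += x[i] * x[i]
    return total

x = np.random.rand(10_000_000)
print(parallel_sum_of_squares(x))
```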
Many numpy calculations are unaffected by the GIL, but not all.
While in code that does not require the Python interpreter (e.g. C libraries) it is possible to specifically release the GIL, allowing other code that depends on the interpreter to continue running. In the NumPy C codebase the macros NPY_BEGIN_THREADS and NPY_END_THREADS are used to delimit blocks of code that permit GIL release. You can see these in this search of the numpy source.
The NumPy C API documentation has more information on threading support. Note the additional macros NPY_BEGIN_THREADS_DESCR, NPY_END_THREADS_DESCR and NPY_BEGIN_THREADS_THRESHOLDED, which handle conditional GIL release, dependent on array dtypes and the size of loops.
Most core functions release the GIL - for example Universal Functions (ufunc) do so as described:
as long as no object arrays are involved, the Python Global Interpreter Lock (GIL) is released prior to calling the loops. It is re-acquired if necessary to handle error conditions.
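The practical consequence is that something like the following sketch can keep several cores busy even though it only uses threads (np.sin stands in for any ufunc on a non-object dtype; the array sizes and thread count are arbitrary):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Four independent chunks of work; np.sin releases the GIL while it runs,
# so the threads can execute the heavy loops concurrently.
arrays = [np.random.rand(5_000_000) for _ in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(np.sin, arrays))
```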
With regard to your own code, the source code for NumPy is available. Check the functions you use (and the functions they call) for the above macros. Note also that the performance benefit is heavily dependent on how long the GIL is released - if your code is constantly dropping in/out of Python you won't see much of an improvement.
The other option is to just test it. However, bear in mind that functions using the conditional GIL macros may exhibit different behaviour with small and large arrays. A test with a small dataset may therefore not be an accurate representation of performance for a larger task.
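A rough way to run that test is a small benchmark along these lines (the workload, array sizes and thread count are all arbitrary choices): time the same calculations sequentially and through a thread pool at several sizes, and see where, if anywhere, the threaded version pulls ahead.

```python
import time
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def workload(a):
    # Stand-in for whatever numpy-heavy function you actually care about
    return np.sin(a) * np.cos(a)

for size in (1_000, 100_000, 10_000_000):
    chunks = [np.random.rand(size) for _ in range(4)]

    start = time.perf_counter()
    for c in chunks:
        workload(c)
    sequential = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(workload, chunks))
    threaded = time.perf_counter() - start

    print(f"size={size}: sequential={sequential:.3f}s threaded={threaded:.3f}s")
```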
There is some additional information on parallel processing with numpy available on the official wiki and a useful post about the Python GIL in general over on Programmers.SE.