Python Global Interpreter Lock (GIL) workaround on multi-core systems using taskset on Linux?

Tags: python, multithreading, multicore, gil, python-stackless

So I just finished watching this talk on the Python Global Interpreter Lock (GIL): http://blip.tv/file/2232410

The gist of it is that the GIL is a pretty good design for single-core systems (Python essentially leaves thread handling and scheduling up to the operating system), but it can seriously backfire on multi-core systems: you end up with I/O-intensive threads being heavily blocked by CPU-intensive threads, the expense of context switching, the ctrl-C problem[*] and so on.

So since the GIL basically limits us to executing a Python program on one CPU, my thought is: why not accept this and simply use taskset on Linux to set the affinity of the program to a certain core/CPU on the system (especially in a situation with multiple Python apps running on a multi-core system)?
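To make the idea concrete: from the shell this would be something like `taskset -c 0 python app.py` (where `app.py` is a hypothetical script name). The same effect can be had from inside the program with `os.sched_setaffinity`, which is Linux-only. A minimal sketch, assuming you simply want the lowest-numbered CPU the process is currently allowed to use:

```python
import os


def pin_to_one_cpu():
    """Pin the current process to a single CPU and return that CPU's index."""
    # sched_getaffinity / sched_setaffinity are only available on Linux.
    allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
    cpu = min(allowed)                 # illustrative choice: lowest allowed CPU
    os.sched_setaffinity(0, {cpu})
    return cpu


if __name__ == "__main__":
    cpu = pin_to_one_cpu()
    print("pinned to CPU", cpu, "affinity is now", os.sched_getaffinity(0))
```

Threads inherit the affinity mask of the thread that creates them, so this should be called before any worker threads are started.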

So ultimately my question is this: has anyone tried using taskset on Linux with Python applications (especially when running multiple applications on a Linux system, so that multiple cores can be used with one or two Python applications bound to a specific core), and if so, what were the results? Is it worth doing? Does it make things worse for certain workloads? I plan to do this and test it out (basically see if the program takes more or less time to run), but would love to hear about others' experiences.
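The kind of test I have in mind could look roughly like the sketch below: run the same CPU-bound threaded workload once with the full affinity mask and once restricted to a single CPU, and compare wall-clock times. The workload (`burn`), the thread count and the loop size are placeholders, not a real benchmark, and the results will depend on the Python version and the machine.

```python
import os
import threading
import time


def burn(n):
    """CPU-bound busy loop in pure Python (holds the GIL while it runs)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(cpus, n_threads=4, n=200_000):
    """Run n_threads CPU-bound threads restricted to `cpus`; return elapsed seconds."""
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, cpus)  # set before the threads start so they inherit it
    try:
        threads = [threading.Thread(target=burn, args=(n,)) for _ in range(n_threads)]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.perf_counter() - start
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask


if __name__ == "__main__":
    all_cpus = os.sched_getaffinity(0)
    one_cpu = {min(all_cpus)}
    print("all CPUs:", timed_run(all_cpus))
    print("one CPU: ", timed_run(one_cpu))
```

Several repetitions per configuration would be needed before reading anything into the numbers.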

Addition: David Beazley (the guy giving the talk in the linked video) pointed out that some C/C++ extensions manually release the GIL, and if these extensions are optimized for multi-core (e.g. scientific or numeric data analysis), then pinning the program would effectively cripple them: rather than getting the benefits of multi-core for number crunching, the extension would be limited to a single core (thus potentially slowing your program down significantly). On the other hand if you aren't using extensions such as this

The reason I am not using the multiprocessing module is that (in this case) part of the program is heavily network I/O bound (HTTP requests), so having a pool of worker threads is a GREAT way to squeeze performance out of a box: a thread fires off an HTTP request and then, since it's waiting on I/O, gives up the GIL so another thread can do its thing. That part of the program can easily run 100+ threads without hurting the CPU much, and it lets me actually use the network bandwidth that is available. As for Stackless Python etc., I'm not overly interested in rewriting the program or replacing my Python stack (availability would also be a concern).
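For illustration, the worker-pool pattern described above looks roughly like this. It is a sketch only: `fake_http_request` is a made-up stand-in that sleeps instead of doing a real HTTP call (like blocking socket I/O, `time.sleep` releases the GIL), and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_http_request(url, latency=0.05):
    """Stand-in for a blocking HTTP call; sleeping releases the GIL like socket I/O does."""
    time.sleep(latency)
    return url, 200  # made-up (url, status) result


def fetch_all(urls, max_workers=100):
    """Run the blocking calls in a thread pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_http_request, urls))


if __name__ == "__main__":
    urls = ["http://example.invalid/%d" % i for i in range(100)]
    start = time.perf_counter()
    results = fetch_all(urls)
    print(len(results), "requests in", round(time.perf_counter() - start, 2), "s")
```

Because the threads spend nearly all their time waiting, the total time stays close to one request's latency rather than the sum of all of them.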

[*] Only the main thread can receive signals, so if you send a ctrl-C the Python interpreter basically tries to get the main thread to run so it can handle the signal. But since it doesn't directly control which thread runs (this is left to the operating system), it basically tells the OS to keep switching threads until it eventually hits the main thread (which, if you are unlucky, may take a while).
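The main-thread restriction is easy to see from Python itself: CPython only lets the main thread install signal handlers, and `signal.signal` raises `ValueError` when called from any other thread. A small sketch (the helper names are mine):

```python
import signal
import threading


def try_install_handler(result):
    """Attempt to install a SIGINT handler from whichever thread calls this."""
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        result.append("installed")
    except ValueError as exc:
        # CPython refuses to install handlers outside the main thread.
        result.append("refused: %s" % exc)


def install_from_worker():
    """Run try_install_handler in a worker thread and return what happened."""
    result = []
    worker = threading.Thread(target=try_install_handler, args=(result,))
    worker.start()
    worker.join()
    return result[0]


if __name__ == "__main__":
    print(install_from_worker())
```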


asked Jun 13 '09 06:06

Kurt

2 Answers

Another solution is: http://docs.python.org/library/multiprocessing.html
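As a rough sketch of what that looks like for CPU-bound work: each worker is a separate process with its own interpreter and its own GIL, so the OS can schedule them on different cores. The task (`count_primes`) is a toy example, and the explicit "fork" start method is an assumption that fits the Linux setting of the question.

```python
import math
import multiprocessing as mp


def count_primes(limit):
    """CPU-bound toy task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def run_in_processes(limits, workers=2):
    """Map count_primes over `limits` using a pool of worker processes."""
    # Assumption: "fork" is available (Linux). Elsewhere, use the default
    # context and keep the __main__ guard below.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    print(run_in_processes([10_000, 20_000, 30_000, 40_000]))
```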

Note 1: This is not a limitation of the Python language, but of the CPython implementation.

Note 2: With regard to affinity, your OS shouldn't have a problem doing that itself.


answered Oct 07 '22 04:10

ynimous

I have never heard of anyone using taskset for a performance gain with Python. Doesn't mean it can't happen in your case, but definitely publish your results so others can critique your benchmarking methods and provide validation.

Personally though, I would decouple your I/O threads from the CPU bound threads using a message queue. That way your front end is now completely network I/O bound (some with HTTP interface, some with message queue interface) and ideal for your threading situation. Then the CPU intense processes can either use multiprocessing or just be individual processes waiting for work to arrive on the message queue.
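A very rough sketch of that split, using `multiprocessing` queues as a stand-in for a real message queue (a production setup would more likely use a separate broker). All names here are illustrative, the "network wait" is simulated with a sleep, and the "fork" start method assumes Linux.

```python
import multiprocessing as mp
import threading
import time


def cpu_worker(tasks, results):
    """Back end: pull payloads off the queue until a None sentinel arrives."""
    for payload in iter(tasks.get, None):
        results.put((payload, sum(i * i for i in range(payload))))


def io_thread(item, tasks):
    """Front end: simulate a network wait, then hand the payload to the back end."""
    time.sleep(0.01)  # stand-in for a blocking HTTP request
    tasks.put(item)


def run_pipeline(items, n_workers=2):
    """Feed unique integer items through I/O threads to CPU worker processes."""
    ctx = mp.get_context("fork")  # assumption: Linux
    tasks, results = ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=cpu_worker, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()  # start the processes before any threads exist
    threads = [threading.Thread(target=io_thread, args=(item, tasks))
               for item in items]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    collected = dict(results.get() for _ in items)  # drain before joining
    for w in workers:
        w.join()
    return collected


if __name__ == "__main__":
    print(run_pipeline([1000, 2000, 3000]))
```

The front end never does CPU-heavy work, so its threads rarely contend for the GIL, while the back-end processes each have their own.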

In the longer term you might also want to consider replacing your threaded I/O front end with Twisted or something like eventlet because, even if they won't help performance, they should improve scalability. Your back end is now already scalable because you can run your message queue over any number of machines and CPUs as needed.

answered Oct 07 '22 03:10

Van Gale
