Are there any caveats to it? I have a few questions related to it.
How costly is it to create more GILs? Is it any different from creating a separate Python runtime? Once a new GIL is created, will it create everything (objects, variables, stack, heap) from scratch as required in that process, or is a copy of everything in the present heap and stack created? (Garbage collection would malfunction if they were working on the same objects.) Are the pieces of code being executed also copied to new CPU cores? Also, can I relate one GIL to one CPU core?
Now copying things is a fairly CPU intensive task (correct me if I am wrong), what would be the threshold to decide whether to go for multiprocessing?
PS: I am talking about CPython but please feel free to extend the answer to whatever you feel is necessary.
In CPython, the Global Interpreter Lock (GIL) is a mutex that allows only one thread at a time to hold control of the Python interpreter. In other words, the lock ensures that only one thread executes Python bytecode at any given time. Therefore, threads running pure Python code cannot take advantage of multiple processors.
The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine.
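A minimal sketch of this, assuming a CPU-bound helper named `sum_of_squares` and a wrapper `work_with_pid` (both hypothetical names, not from the original post). Each worker in the pool is a separate process with its own interpreter, and therefore its own GIL, which the differing PIDs make visible:

```python
import os
from multiprocessing import Pool


def sum_of_squares(n):
    """CPU-bound work: sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def work_with_pid(n):
    """Return the PID of the process doing the work, plus the result."""
    return os.getpid(), sum_of_squares(n)


if __name__ == "__main__":
    inputs = [200_000, 300_000, 400_000, 500_000]
    # Two worker processes; the pool size here is an arbitrary choice.
    with Pool(processes=2) as pool:
        results = pool.map(work_with_pid, inputs)
    print("parent PID:", os.getpid())
    for pid, value in results:
        print("worker PID:", pid, "result:", value)
```

The `if __name__ == "__main__":` guard matters on platforms where child processes are started by re-importing the main module.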
The GIL provides an important simplifying model of object access (including refcount manipulation) because it ensures that only one thread of execution can mutate Python objects at a time. There are important performance benefits of the GIL for single-threaded operations as well.
This is achieved by preventing threads from using the Python interpreter simultaneously while they run. Another option is to use threaded extensions written in C, where the GIL is not a problem (Numexpr, NumPy with MKL, SciPy with FFTW...). Pro: powerful and very easy to use.
Looking back at this question after 6 months, I feel I can clarify the doubts of my younger self. I hope this will be helpful to people who stumble upon it.
Yes, it is true that in the multiprocessing module each process has a separate GIL, and there are no caveats to it. But the understanding of the runtime and the GIL in the question is flawed and needs to be corrected.
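As a small illustration that each process has its own interpreter state, here is a sketch using a hypothetical module-level `counter` dict and helper functions `increment` and `child` (none of these names come from the original post). The child mutates its own copy of the object, whether that copy came from a fork of the parent or from a fresh interpreter re-importing the module, and the parent's object is left unchanged:

```python
from multiprocessing import Process, Queue

# Module-level object; every process ends up with its own independent copy.
counter = {"value": 0}


def increment():
    """Mutate the counter that belongs to the current process."""
    counter["value"] += 1
    return counter["value"]


def child(queue):
    # Runs in the child process and reports what it sees back to the parent.
    queue.put(increment())


if __name__ == "__main__":
    queue = Queue()
    process = Process(target=child, args=(queue,))
    process.start()
    child_value = queue.get()  # read before join() to avoid blocking on a full pipe
    process.join()
    print("value seen in child:", child_value)
    print("value in parent:", counter["value"])
```

Because the two processes do not share Python objects, their garbage collectors and reference counts never interfere with each other. Data has to be passed explicitly, here through a `Queue`.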
I will clear up the doubts and answer the questions with a series of statements.
What is copied to the cores, and how the OS tries to keep a process on the core it is working on, is a separate and very deep topic in itself.
The final question is a subjective one. With all this understanding, it is basically a cost-to-benefit ratio that varies from program to program and depends on how CPU-intensive the work is, how many cores the machine has, and so on. So it cannot be generalised.
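Since the threshold cannot be generalised, one practical approach is to measure it for the workload at hand. The following is only a rough sketch: the function `busy`, the helper names, the input sizes, and the worker count are all assumptions chosen for illustration. It times the same CPU-bound work serially and with a process pool for a small and a large input:

```python
import time
from multiprocessing import Pool


def busy(n):
    """Deliberately CPU-bound loop."""
    total = 0
    for i in range(n):
        total += i % 7
    return total


def run_serial(sizes):
    return [busy(n) for n in sizes]


def run_parallel(sizes, workers=2):
    # Includes the cost of starting the worker processes and moving data.
    with Pool(processes=workers) as pool:
        return pool.map(busy, sizes)


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for n in (1_000, 2_000_000):
        sizes = [n] * 4
        serial, t_serial = timed(run_serial, sizes)
        parallel, t_parallel = timed(run_parallel, sizes)
        assert serial == parallel
        print(f"n={n}: serial {t_serial:.4f}s, processes {t_parallel:.4f}s")
```

For tiny inputs the process start-up and communication overhead usually dominates, while for larger ones the extra cores tend to pay off. The actual crossover point depends on the machine and the workload.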