Is it true that in multiprocessing, each process gets its own GIL in CPython? How different is that from creating new runtimes?

Are there any caveats to it? I have a few questions related to it.

How costly is it to create more GILs? Is it any different from creating a separate Python runtime? Once a new GIL is created, will everything (objects, variables, stack, heap) be created from scratch as required in that process, or is a copy made of everything in the present heap and stack? (Garbage collection would malfunction if the processes were working on the same objects.) Are the pieces of code being executed also copied to new CPU cores? Also, can I relate one GIL to one CPU core?

Copying things is a fairly CPU-intensive task (correct me if I am wrong), so what would be the threshold for deciding whether to go for multiprocessing?

PS: I am talking about CPython but please feel free to extend the answer to whatever you feel is necessary.

asked Feb 15 '20 07:02

sprksh

1 Answer

Looking back at this question after 6 months, I feel I can clarify the doubts of my younger self. I hope this would be helpful to people who stumble upon it.

Yes, it is true that in the multiprocessing module each process has a separate GIL, and there are no caveats to it. But the understanding of the runtime and the GIL in the question is flawed and needs to be corrected.

I will clear up the doubts and answer the questions with a series of statements.

Python code is run by the CPython virtual machine: the source is compiled to CPython bytecode, and the interpreter then executes that bytecode. This is what constitutes the Python runtime.
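When we create a new process, a separate Python virtual machine (which we call the Python process) comes into existence with its own stack and heap memory. How its state is populated depends on the start method: with spawn, a fresh interpreter is launched and builds its objects from scratch; with fork, the child begins as a copy-on-write copy of the parent's memory. Either way, ordinary Python objects are not shared between the two processes, so each garbage collector works only on its own objects.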
Yes, this is a costly operation, but not too costly, because the Python virtual machine is a piece of C code precompiled to machine code. To put it in perspective: Java programs rarely use multiple processes for parallelism, because each process would need its own JVM, which needs a lot of memory, and because Java threads can already run in parallel on several cores, since the JVM has no GIL.
The GIL is just a lock (a mutex) within the Python virtual machine that lets only one thread at a time execute CPython bytecode. So all the questions about creating GILs and their cost are moot. Basically, the intention was to ask about the CPython virtual machine.
Can I relate 1 GIL to 1 CPU core? It is better to ask whether 1 Python process can be related to 1 CPU core. No. It is the kernel's job to decide which core a process runs on (this keeps changing from time to time, and the process normally has no say in it). The only guarantee is that, at any given point in time, only one thread in a Python process is executing CPython bytecode (due to the GIL), so pure-Python code in one process does not use more than one core at once.
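
The point about each process having its own memory can be checked with a small experiment. This is only a minimal sketch: the names `counter`, `bump` and `run_demo` are made up for illustration, and the choice of start method (fork where available, otherwise spawn) is an assumption that does not affect the result.

```python
import multiprocessing as mp
import os

counter = 0  # module-level state; every process ends up with its own copy


def bump(queue):
    """Runs in the child: change the child's copy and report it back."""
    global counter
    counter += 100
    queue.put((os.getpid(), counter))


def run_demo():
    # Assumption: prefer fork where the platform offers it, else use spawn.
    method = "fork" if "fork" in mp.get_all_start_methods() else "spawn"
    ctx = mp.get_context(method)
    queue = ctx.Queue()
    child = ctx.Process(target=bump, args=(queue,))
    child.start()
    child_pid, child_counter = queue.get()  # read before join to avoid blocking
    child.join()
    # The parent's counter was never touched by the child.
    return os.getpid(), child_pid, counter, child_counter


if __name__ == "__main__":
    parent_pid, child_pid, parent_counter, child_counter = run_demo()
    print(f"parent pid={parent_pid} counter={parent_counter}")
    print(f"child  pid={child_pid} counter={child_counter}")
```

The child reports a changed counter while the parent's value stays as it was, which is what you would expect if the two interpreters do not share Python objects.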

What gets copied between cores, and how the OS tries to keep a process on the core it is working on, is a separate and very deep topic in itself.
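
Without going into that topic, you can at least look at which cores the kernel currently allows a process to run on. The sketch below assumes a platform that exposes `os.sched_getaffinity` (Linux does; elsewhere the hypothetical helper `allowed_cpus` simply returns `None`).

```python
import os


def allowed_cpus():
    """Return the set of CPU indices this process may run on, or None if unknown."""
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)  # 0 means "the calling process"
    return None


if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count())
    print("allowed CPUs:", allowed_cpus())
```

This only reports the set of permitted cores; which one of them the process is on at a given moment is still up to the scheduler.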

The final question is a subjective one, but with all this understanding, it is basically a cost-to-benefit ratio that may vary from program to program and depends on how CPU-intensive the work is, how many cores the machine has, and so on. So it cannot be generalised.
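
Since the threshold cannot be stated in general, the practical approach is to measure it for your own workload. Here is a rough sketch of such a measurement; `count_primes` and `compare` are hypothetical names, the workload is an arbitrary CPU-bound stand-in, and the timings you get will depend entirely on your machine.

```python
import multiprocessing as mp
import time
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound stand-in workload: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def compare(limits, workers=2):
    """Run the same tasks serially and in a process pool; return results and timings."""
    # Assumption: prefer fork where the platform offers it, else use spawn.
    method = "fork" if "fork" in mp.get_all_start_methods() else "spawn"
    ctx = mp.get_context(method)

    start = time.perf_counter()
    serial = [count_primes(limit) for limit in limits]
    serial_s = time.perf_counter() - start

    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
        parallel = list(pool.map(count_primes, limits))
    parallel_s = time.perf_counter() - start  # includes process start-up cost

    return serial, parallel, serial_s, parallel_s


if __name__ == "__main__":
    serial, parallel, serial_s, parallel_s = compare([30_000] * 4)
    print(f"serial:   {serial_s:.3f}s")
    print(f"parallel: {parallel_s:.3f}s")
```

With very small tasks the pool will usually lose, because starting processes and passing data between them costs more than the work itself; as the tasks grow, the balance shifts.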

answered Sep 21 '22 06:09

sprksh

Tags: python, multiprocessing, cpython, cpu, gil
