My work needs parallel techniques, and I am a new Python user. Could you share some material about the Python multiprocessing and subprocess modules? What is the difference between the two?
A process is an instance of a running program (e.g. a Jupyter notebook server or a Python interpreter). A process can start threads to handle subtasks such as reading keystrokes, loading HTML pages, or saving files. Threads live inside a process and share its memory space, whereas separate processes do not share memory by default.
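As a rough illustration of threads sharing one memory space, the sketch below has several threads increment the same dictionary entry. The names `counter`, `add_many` and `run_threads` are made up for this example, and the lock is there only so the increments do not interleave.

```python
import threading

# Shared state: every thread in this process sees the same dictionary.
counter = {"value": 0}
lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, one locked step at a time."""
    for _ in range(n):
        with lock:
            counter["value"] += 1


def run_threads(num_threads=4, n=1000):
    """Start num_threads threads and return the final shared counter value."""
    counter["value"] = 0
    threads = [threading.Thread(target=add_many, args=(n,)) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run_threads())  # 4 threads x 1000 increments -> 4000
```

If the same workers were started as separate processes, each one would increment its own copy of `counter`, and the parent's value would stay unchanged.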
Multithreading refers to the ability of a processor to execute multiple threads concurrently, where each thread runs part of a process. Multiprocessing, by contrast, refers to the ability of a system to use multiple processors (or cores) concurrently, where each processor can run one or more threads.
multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.
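To make the "API similar to threading" point concrete, here is a minimal sketch that starts the same function once as a thread and once as a process. The helper names `describe` and `show` are invented for the example. The thread should report the parent's process ID, while the process should report a different one.

```python
import multiprocessing
import os
import threading


def describe(label):
    """Return a label together with the ID of the process running this code."""
    return f"{label}: pid={os.getpid()}"


def show(label):
    print(describe(label))


if __name__ == "__main__":
    print(describe("parent"))
    # Thread and Process take the same target/args and share start()/join().
    thread = threading.Thread(target=show, args=("thread",))
    process = multiprocessing.Process(target=show, args=("process",))
    for worker in (thread, process):
        worker.start()
        worker.join()
```

The `if __name__ == "__main__":` guard matters on platforms that start child processes by re-importing the main module, such as Windows and macOS.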
The subprocess module in Python is used to run other programs and commands by creating new processes. It lets you start external applications directly from the Python program you are writing. So, if you want to run external programs from a git repository, or compiled C or C++ programs, you can use subprocess.
The subprocess module lets you run and control other programs. Anything you can start from the command line on your computer can be run and controlled with this module. Use it to integrate external programs into your Python code.
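A minimal sketch of running an external program with `subprocess.run` might look like this. The wrapper name `run_command` is made up for the example, and the "external program" here is just a second Python interpreter (`sys.executable`) so that the snippet works on any platform. Any other command-line program could be passed the same way.

```python
import subprocess
import sys


def run_command(args):
    """Run an external command and return what it wrote to standard output.

    check=True raises CalledProcessError if the command exits with an error.
    """
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


if __name__ == "__main__":
    # Launch a separate interpreter as a stand-in for any external program.
    output = run_command([sys.executable, "-c", "print('hello from a child process')"])
    print(output.strip())
```

Passing the command as a list of arguments, rather than a single shell string, avoids quoting problems and does not require `shell=True`.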
The multiprocessing module lets you divide tasks written in Python over multiple processes to help improve performance. It provides an API very similar to the threading module; it provides methods to share data across the processes it creates, and makes the task of managing multiple processes to run Python code (much) easier. In other words, multiprocessing lets you take advantage of multiple processes to get your tasks done faster by executing code in parallel.
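As a small sketch of dividing a Python task over several processes, the example below uses `multiprocessing.Pool` to apply one function to a list of inputs. The function names `square` and `parallel_squares` and the pool size of 2 are arbitrary choices for illustration.

```python
import multiprocessing


def square(n):
    """The unit of work each worker process runs; must be defined at module level."""
    return n * n


def parallel_squares(numbers, processes=2):
    """Split the inputs across a pool of worker processes and collect the results."""
    with multiprocessing.Pool(processes=processes) as pool:
        # map() preserves input order even though the work runs in parallel.
        return pool.map(square, numbers)


if __name__ == "__main__":
    print(parallel_squares(range(5)))  # [0, 1, 4, 9, 16]
```

For a task this small, the cost of starting processes outweighs any gain. The approach tends to pay off when each call does a substantial amount of CPU-bound work.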
If you want to call an external program (especially one not written in Python), use subprocess.
If you want to call a Python function in a subprocess, use multiprocessing.
(If the program is written in Python, but is also importable, then I would try to call its functions using multiprocessing, rather than calling it externally through subprocess.)