Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can I run multiprocessing Python programs on a single core machine?

So this is more or less a theoretical question. I have a single core machine which is supposedly powerful but nevertheless only one core. Now I have two choices to make :

  1. Multithreading: As far as my knowledge is concerned I cannot make use of multiple cores in my machines even if I had them because of GIL. Hence in this situation, it does not make any difference.

  2. Multiprocessing: This is where I have a doubt. Can I do multiprocessing on a single core machine? Or everytime I have to check the cores available in my machine and then run exactly the same or less number of processes?

Can someone please guide me on the relation between multiprocessing and cores in a machine.

I know this is a theoretical question but my concepts are not very clear on this.

like image 475
Subhayan Bhattacharya Avatar asked Sep 23 '18 10:09

Subhayan Bhattacharya


People also ask

Does multiprocessing require multiple cores?

Common research programming languages use only one processor The “multi” in multiprocessing refers to the multiple cores in a computer's central processing unit (CPU). Computers originally had only one CPU core or processor, which is the unit that makes all our mathematical calculations possible.

Does Python multiprocessing use multiple cores?

Key Takeaways Python is NOT a single-threaded language. Python processes typically use a single thread because of the GIL. Despite the GIL, libraries that perform computationally heavy tasks like numpy, scipy and pytorch utilise C-based implementations under the hood, allowing the use of multiple cores.

What happens if we apply multiprocessing on single core system explain?

In a multithreaded process on a single processor, the processor can switch execution resources between threads, resulting in concurrent execution. Concurrency indicates that more than one thread is making progress, but the threads are not actually running simultaneously.

Does Python multiprocessing work on Windows?

The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.


1 Answers

This is a big topic but here are some pointers.

  • Think of threads as processes that share the same address space and can access the same memory. Communication is done by shared variables. Multiple threads can run within the same process.
  • Processes (in this context, and roughly speaking) have their own private data and if two processes want to communicate that communication has to be done more explicitly.
  • When you are writing a program where the bottleneck is CPU cycles, neither threads or processes will give you a speedup on a single core machine.
  • Processes and threads are still useful for multitasking (rapid switching between (sub)programs) - this is what your operating system does because it runs far more processes than you have cores.
  • Processes and threads (or even coroutines!) can give you considerable speedup even on a single core machine if the tasks you are executing are I/O bound - think of fetching data from a network. For example, instead of actively waiting for data to be sent or to arrive, another process or thread can initiate the next network operation.
  • Threads are preferable over processes when you don't need explicit encapsulation due to their lower overhead. For most CPU-bound concurrent problems, and especially the large subset of "embarassingly parallel" ones, it does not make much sense to spawn more processes than you have processors.
  • The Python GIL prevents two threads in the same process from running in parallel, i.e. from multiple cores executing instructions literally at the same time.
  • Therefore threads in Python are relatively useless for speeding up CPU-bound tasks, but can still be very useful for I/O bound tasks, because blocking operations (e.g. waiting for network data) release the GIL such that another thread can run while the other waits.
  • If you have multiple processors, you can have true parallelism by spawning multiple processes despite the GIL. This is only worth it for CPU bound tasks, and often you have to consider the overhead of spawning processes and the communication cost between processes.
like image 197
timgeb Avatar answered Oct 17 '22 01:10

timgeb