Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiprocessing : More processes than cpu.count

Note: I "forayed" into the land of multiprocessing 2 days ago. So my understanding is very basic.

I am writing and application for uploads to amazon s3 buckets. In case the file size is larger(100mb), Ive implemented parallel uploads using pool from the multiprocessing module. I am using a machine with core i7 , i had a cpu_count of 8. I was under the impression that if i do pool = Pool(process = 6) I use 6 cores and the file begins to upload in parts and the uploads for the first 6 parts begins simultaneously. To see what happens when the process is greater than the cpu_count , i entered 20 (implying that i want to use 20 cores). To my surprise instead of getting a block of errors the program began to upload 20 parts simultaneously (I used a smaller chunk size to make sure there are plenty of parts). I dont understand this behavior. I have only 8 cores, so how cant he program accept an input of 20? When I say process=6, does it actually use 6 threads?? Which can be the only explanation of 20 being a valid input as there can be 1000s of threads. Can someone please explain this to me.

Edit:

I 'borrowed' the code from here. I have changed it only slightly wherein I ask the user for a core usage for his choice instead of setting parallel_processes to 4

like image 227
letsc Avatar asked Mar 17 '15 00:03

letsc


People also ask

Can you have more processes than cores?

You can use more processes than your machine has cores by specifying either a percent value greater than 100% or a number of processes greater than the number of cores on your machine. For example, if you have a 4-core machine, specifying 8 or 200% will spread operations over 8 processes.

How many processes can a CPU process at one time?

A single processor can run only one instruction at a time: it is impossible to run more programs at the same time. A program might need some resource, such as an input device, which has a large delay, or a program might start some slow operation, such as sending output to a printer.

How many cores does multiprocessing use?

Common research programming languages use only one processor The “multi” in multiprocessing refers to the multiple cores in a computer's central processing unit (CPU). Computers originally had only one CPU core or processor, which is the unit that makes all our mathematical calculations possible.

How many process a CPU can handle?

A single CPU handles one process at a time. But a "process" is a construct of an operating system; the OS calls playing a video in VLC a single process, but it's actually made up of lots of individual instructions. So it's not as if a CPU is tasked with playing a video and has to drop everything it was doing.


1 Answers

The number of processes running concurrently on your computer is not limited by the number of cores. In fact you probably have hundreds of programs running right now on your computer - each with its own process. To make it work the OS assigns one of your 8 processors to each process or thread only temporarily - at some point it may get stopped and another process will take its place. See What is the difference between concurrent programming and parallel programming? if you want to find out more.

Edit: Assigning more processes in your uploading example may or may not make sense. Reading from disk and sending over the network is normally a blocking operation in python. A process that waits for its chunk of data to be read or sent can be halted so that another process may start its IO. On the other hand, with too many processes either file I/O or network I/O will become a bottleneck and your program will slow down because of the additional overhead needed for process switching.

like image 113
Pyetras Avatar answered Sep 23 '22 17:09

Pyetras