
How do I tell if Python's Multiprocessing module is using all of my cores for calculations?

I have some simple code from a tutorial like this:

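from multiprocessing import Process
import os

def f(i):
    # Each child reports its own PID and the PID of the parent that started it.
    print('hello world', i)
    print('parent process:', os.getppid())
    print('process id:', os.getpid(), "\n\n")

if __name__ == '__main__':
    # Start all children first so they can run concurrently.
    processes = [Process(target=f, args=(num,)) for num in range(10)]
    for p in processes:
        p.start()
    # Wait for every child, not just the last one started.
    for p in processes:
        p.join()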

How can I tell if this is utilising both of my cores? Currently I'm running Ubuntu 11.04 w/ 3 GB RAM and Intel Core 2 Duo @ 2.2GHz.

The project I'm learning this for is going to be moved to a huge machine in somebody's office, with much more horsepower than I currently have at my disposal. Specifically, the processor will have at least 4 cores, and I want to be sure my algorithm automatically detects and utilises all available cores. Also, that system will potentially be something other than Linux, so are there any common pitfalls that I have to watch for when moving code that uses the multiprocessing module between operating systems?

Oh yeah, also, the output of the script looks something like this:

hello world 0
parent process: 29362
process id: 29363 


hello world 1
parent process: 29362
process id: 29364 


hello world 2
parent process: 29362
process id: 29365 

and so on...

So from what I know so far, the PPIDs are all the same because the script above, when run, is the parent process, and it starts the child processes, each of which is a separate process. So does multiprocessing automatically detect and handle multiple cores, or do I have to tell it where to look? Also, from what I read while searching for duplicates of this question, I shouldn't spawn more processes than there are cores, because the extra processes eat up system resources that would otherwise be used for computations.

Thanks in advance for your help, my thesis loves you.

user1173922 asked Oct 09 '22 04:10


1 Answer

Here's a handy little command I use to monitor my cores from the command line:

watch -d "mpstat -P ALL 1 1 | head -n 12"

Note that the mpstat command must be available on your system. On Ubuntu, it is provided by the sysstat package:

sudo apt-get install sysstat

If you want to detect the number of available cores from Python, you can do so using the multiprocessing.cpu_count() function. It reports logical CPUs, so on Intel CPUs with Hyper-Threading this number will typically be double the actual number of physical cores.

Launching as many processes as you have available cores will usually scale to fully occupy all cores on your machine, as long as the processes have enough work to do and don't get bogged down with communication. You don't need to assign processes to cores yourself; Linux's process scheduler will take it from there.
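As a rough sketch of that advice, the snippet below sizes a multiprocessing.Pool from cpu_count() and hands it some CPU-bound work. The busy_work and run_on_all_cores functions and the task sizes are made up for illustration; they are not from the question. It assumes Python 3.

```python
import multiprocessing
import os


def busy_work(n):
    """Hypothetical CPU-bound task: sum of squares of the integers below n."""
    total = 0
    for i in range(n):
        total += i * i
    # Return the worker's PID so the caller can see which process ran the task.
    return os.getpid(), total


def run_on_all_cores(task_sizes):
    """Run busy_work over task_sizes with one worker per logical CPU."""
    n_workers = multiprocessing.cpu_count()
    with multiprocessing.Pool(processes=n_workers) as pool:
        return n_workers, pool.map(busy_work, task_sizes)


if __name__ == '__main__':
    # 16 equally sized tasks (an arbitrary choice for this sketch).
    n_workers, results = run_on_all_cores([500000] * 16)
    pids = sorted({pid for pid, _ in results})
    print('workers requested:', n_workers)
    print('distinct worker PIDs that ran tasks:', pids)
```

Running this while the mpstat command above is active in another terminal should show the load spread across the cores. With tasks this small, some workers may finish before others pick anything up, so the number of distinct PIDs can be lower than the number of workers.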

Brendan Wood answered Oct 12 '22 20:10