
IPython.parallel not using multicore?

I am experimenting with IPython.parallel and just want to launch several shell commands on different engines.

I have the following Notebook:

Cell 0:

from IPython.parallel import Client
client = Client()
print len(client)
5

And launch the commands:

Cell 1:

%%px --targets 0 --noblock
!python server.py

Cell 2:

%%px --targets 1 --noblock
!python mincemeat.py 127.0.0.1

Cell 3:

%%px --targets 2 --noblock
!python mincemeat.py 127.0.0.1

These cells run the mincemeat implementation of MapReduce. When I launch the first !python mincemeat.py 127.0.0.1, it uses roughly 100% of one core; when I launch the second, each process drops to 50%. The machine has 4 cores (plus virtual cores), and I can use all of them when launching the same commands directly from the terminal, but not from the Notebook.

Is there something I am missing? I would like to use one core per !python mincemeat.py 127.0.0.1 command.

EDIT:
For clarity, here's another thing that's not using multiple cores:

Cell 1:

%%px --targets 0 --noblock

a = 0
for i in xrange(100000):
    for j in xrange(10000):
        a += 1

Cell 2:

%%px --targets 0 --noblock

a = 0
for i in xrange(100000):
    for j in xrange(10000):
        a += 1

I suppose I am missing something. I believe those two cells should run on different cores if available. However, that does not seem to be the case: again, the CPU usage shows that they share the same core and each use 50% of it. What did I do wrong?

zermelozf asked May 01 '13 18:05


1 Answer

Summary of the chat discussion:

CPU affinity is a mechanism for pinning a process to a particular CPU core. The issue here is that importing numpy can sometimes pin Python processes to CPU 0, as a side effect of linking against particular BLAS libraries. You can unpin all of your engines by running this cell:

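%%px
import os
import psutil
from multiprocessing import cpu_count

# Handle for the engine's own process
p = psutil.Process(os.getpid())
# Allow the engine to run on every CPU, not just CPU 0.
# Note: psutil >= 2.0 renamed these methods to p.cpu_affinity(cpus) / p.cpu_affinity()
p.set_cpu_affinity(range(cpu_count()))
print p.get_cpu_affinity()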

This uses multiprocessing.cpu_count to get the number of CPUs, and then allows every engine to run on all of them.
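
If psutil is not available, the same reset can probably be sketched with the standard library alone. This is a minimal sketch, assuming Linux and Python 3.3 or later (where os.sched_getaffinity and os.sched_setaffinity exist); unpin_current_process is a hypothetical helper name, not part of IPython or psutil.

```python
import os


def unpin_current_process():
    """Let the current process run on every CPU the OS will grant it.

    Returns the resulting affinity set, or None on platforms that do
    not provide os.sched_setaffinity (for example macOS or Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    before = os.sched_getaffinity(0)  # pid 0 means "this process"
    try:
        # Request all CPUs; the kernel keeps only those it permits.
        os.sched_setaffinity(0, range(os.cpu_count() or 1))
    except OSError:
        pass  # request rejected; the affinity is left unchanged
    after = os.sched_getaffinity(0)
    print("affinity before: %s, after: %s" % (sorted(before), sorted(after)))
    return after


if __name__ == "__main__":
    unpin_current_process()
```

To apply it to the engines, the function definition and a call to it would go in a %%px cell, so that each engine resets its own affinity. If the "before" list printed by an engine contains only CPU 0, that engine was pinned in the way described above.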

An IPython notebook exploring the issue.

minrk answered Oct 21 '22 05:10
