Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multiprocessing on Python 3 Jupyter

I come here because I have an issue with my Jupiter's Python3 notebook. I need to create a function that uses the multiprocessing library. Before to implement it, I make some tests. I found a looooot of different examples but the issue is everytime the same : my code is executed but nothing happens in the notebook's interface :

enter image description here

The code i try to run on jupyter is this one :

import os

from multiprocessing import Process, current_process


def doubler(number):
    """
    A doubling function that can be used by a process
    """
    result = number * 2
    proc_name = current_process().name
    print('{0} doubled to {1} by: {2}'.format(
        number, result, proc_name))
    return result


if __name__ == '__main__':
    numbers = [5, 10, 15, 20, 25]
    procs = []
    proc = Process(target=doubler, args=(5,))

    for index, number in enumerate(numbers):
        proc = Process(target=doubler, args=(number,))
        proc2 = Process(target=doubler, args=(number,))
        procs.append(proc)
        procs.append(proc2)
        proc.start()
        proc2.start()

    proc = Process(target=doubler, name='Test', args=(2,))
    proc.start()
    procs.append(proc)

    for proc in procs:
        proc.join()

It's OK when I just run my code without Jupyter but with the command "python my_progrem.py" and I can see the logs : enter image description here

Is there, for my example, and in Jupyter, a way to catch the results of my two tasks (proc1 and proc2 which both call thefunction "doubler") in a variable/object that I could use after ? If "yes", how can I do it?

like image 345
Konate Malick Avatar asked Jun 19 '18 21:06

Konate Malick


People also ask

Can Jupyter notebook use multiple cores?

The maximum number of cores per session is 40 (number of core on a single compute node). The Python kernel running in Jupyter is a single-node process, it cannot utilize more than one computer (node). This is a general limitation of Jupyter, not just on our cluster.

Can we do multiprocessing in Python?

The multiprocessing package supports spawning processes. It refers to a function that loads and executes a new child processes. For the child to terminate or to continue executing concurrent computing,then the current process hasto wait using an API, which is similar to threading module.

Can I run two Jupyter notebooks at the same time?

You can run a Jupyter notebook from another Jupyter notebook that has the same type of kernel. For example, you can run a Jupyter notebook with Spark kernel from another Jupyter notebook with Spark kernel. From the left Sidebar, select and right-click on the Jupyter notebook that has to be run from another notebook.

How do you write a multiprocessing code in Python?

Python multiprocessing Process classAt first, we need to write a function, that will be run by the process. Then, we need to instantiate a process object. If we create a process object, nothing will happen until we tell it to start processing via start() function. Then, the process will run and return its result.


2 Answers

@Konate's answer really helped me. Here is a simplified version using multiprocessing.pool:

import multiprocessing

def double(a):
    return a * 2

def driver_func():
    PROCESSES = 4
    with multiprocessing.Pool(PROCESSES) as pool:
        params = [(1, ), (2, ), (3, ), (4, )]
        results = [pool.apply_async(double, p) for p in params]

        for r in results:
            print('\t', r.get())

enter image description here

like image 127
Kamen Tsvetkov Avatar answered Oct 03 '22 02:10

Kamen Tsvetkov


I succeed by using multiprocessing.pool. I was inspired by this approach :

def test():
    PROCESSES = 4
    print('Creating pool with %d processes\n' % PROCESSES)

with multiprocessing.Pool(PROCESSES) as pool:
    TASKS = [(mul, (i, 7)) for i in range(10)] + \
            [(plus, (i, 8)) for i in range(10)]

    results = [pool.apply_async(calculate, t) for t in TASKS]
    imap_it = pool.imap(calculatestar, TASKS)
    imap_unordered_it = pool.imap_unordered(calculatestar, TASKS)

    print('Ordered results using pool.apply_async():')
    for r in results:
        print('\t', r.get())
    print()

    print('Ordered results using pool.imap():')
    for x in imap_it:
        print('\t', x)

...etc For more, the code is at : https://docs.python.org/3.4/library/multiprocessing.html?

like image 44
Konate Malick Avatar answered Oct 03 '22 03:10

Konate Malick