Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it possible to start Pool processes sequentially?

The following code starts three processes, they are in a pool to handle 20 worker calls:

import multiprocessing

def worker(nr):
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()

Is there a way to start the processes in a sequence (as opposed to having them starting all at the same time), with a delay inserted between each process start?

If not using a Pool I would have used multiprocessing.Process(target=worker, args=(nr,)).start() in a loop, starting them one after the other and inserting the delay as needed. I find Pool to be extremely useful, though (together with the map call) so I would be glad to keep it if possible.

like image 825
WoJ Avatar asked Sep 26 '22 18:09

WoJ


1 Answers

According to the documentation, no such control over pooled processes exists. You could however, simulate it with a lock:

import multiprocessing
import time

lock = multiprocessing.Lock()

def worker(nr):
    lock.acquire()
    time.sleep(0.100)
    lock.release()
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()

Your 3 processes will still start simultaneously. Well, what I mean is you don't have control over which process starts executing the callback first. But at least you get your delay. This effectively has each worker "starting" (but really, continuing) at designated intervals.

Ammendment resulting from discussion below:

Note that on Windows it's not possible to inherit a lock from a parent process. Instead, you can use multiprocessing.Manager().Lock() to communicate a global lock object between processes (with additional IPC overhead, of course). The global lock object needs to be initialized in each process, as well. This would look like:

from multiprocessing import Process, freeze_support
import multiprocessing
import time
from datetime import datetime as dt

def worker(nr):
    glock.acquire()
    print('started job: {} at {}'.format(nr, dt.now()))
    time.sleep(1)
    glock.release()
    print('ended   job: {} at {}'.format(nr, dt.now()))

numbers = [i for i in range(6)]

def init(lock):
    global glock
    glock = lock

if __name__ == '__main__':
    multiprocessing.freeze_support()
    lock = multiprocessing.Manager().Lock()
    pool = multiprocessing.Pool(processes=3, initializer=init, initargs=(lock,))
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
like image 194
Velimir Mlaker Avatar answered Sep 30 '22 06:09

Velimir Mlaker