I am using a Pool to take advantage of multiple cores. Each worker in the pool needs its own Calculator object. Initializing a Calculator is quite time-consuming, so I would like to create it only once per worker in the pool, not every time a new task arrives. The only way I got this working was by using the “ugly” keyword global. Is there a “cleaner” way to implement this?
I would like to avoid queues (the parent process is often SIGKILLed, which leaves child processes behind when using queues) and managers (too slow).
#!/usr/bin/python
# -*- coding: utf-8 -*-
import multiprocessing

# Calculator is assumed to be defined or imported elsewhere.

def init_pool():
    global calculator
    calculator = Calculator()  # should be executed only once per worker

def run_pool(args):
    return calculator.calculate(*args)  # time-consuming calculation

class Organiser():
    def __init__(self):
        self.__pool = multiprocessing.Pool(initializer=init_pool)

    def process(self, tasks):
        results = self.__pool.map(run_pool, tasks)
        return results
I don't see a cleaner way to achieve what you want (initialize exactly once per worker).
However, the following seems to work if you want to initialize Calculator exactly once for the whole group of workers:
import multiprocessing

def run_pool(args):
    calculator, arg = args
    return calculator.calculate(arg)  # time-consuming calculation

class Organiser():
    def __init__(self):
        # Built once in the parent. It must be picklable, because it is
        # serialized and sent to the workers together with the tasks.
        self.calculator = Calculator()
        self.__pool = multiprocessing.Pool(processes=4)

    def process(self, tasks):
        results = self.__pool.map(run_pool, [(self.calculator, data) for data in tasks])
        return results
To initialize exactly once per worker, it appears to me that you must use global variables or singletons (which amount to the same thing). I will await other answers to your question as well :)
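For what it's worth, here is a minimal sketch of the singleton-style variant, assuming Python 3. The Calculator class below is only a hypothetical stand-in for the real one, and WorkerState, init_worker and run_task are names made up for illustration. It still relies on per-process module state, so it is really the global approach in different clothing, just without the global keyword.

```python
import multiprocessing
import os


class Calculator:
    """Hypothetical stand-in for the question's expensive Calculator."""

    def __init__(self):
        # Imagine a slow setup here; we only record which process built us.
        self.pid = os.getpid()

    def calculate(self, a, b):
        return a + b


class WorkerState:
    """Holds one Calculator per process as a class attribute."""

    calculator = None


def init_worker():
    # Passed as Pool(initializer=...), so it runs once in each worker process.
    WorkerState.calculator = Calculator()


def run_task(args):
    # Reuses the Calculator that init_worker built in this process.
    return WorkerState.calculator.calculate(*args)


def main():
    tasks = [(1, 2), (3, 4), (5, 6)]
    with multiprocessing.Pool(processes=2, initializer=init_worker) as pool:
        return pool.map(run_task, tasks)


if __name__ == "__main__":
    print(main())
```

Each worker process gets its own copy of WorkerState, so the Calculator is built once per worker and never pickled or sent between processes.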
Regards, Siddharth