I'm using Parallel Python to execute a big function (executePipeline) multiple times. This function also uses multiprocessing (via the multiprocessing module).
I'm having trouble getting the logging messages to display correctly on my console when I use the Parallel Python module. When I'm not using it, the logging messages display fine.
Here is how it works. I have a server that calls a worker every time it gets a request from a client, using:
job = self.server.job_server.submit(func = executeWorker, args = (config, ) )
This function is executed from a new thread every time there is a new request from a client. The worker then calls the executePipeline function, which spawns several processes using multiprocessing.
The server is a SocketServer.TCPServer using threading. I set up a logger in my server as follows, using the root logger:
self.logger = logging.getLogger()
self.logger.setLevel(logging.INFO)
self.logger.addHandler(logging.StreamHandler())
self.job_server = pp.Server(ncpus = 8) # for test
self.jobs = []
When I run my server I only get the logging from executePipeline, not from its child processes. I also only get the executePipeline logging at the end of the job, not while it is running.
Here is the worker code. The "Executing pipeline with worker {}" message is displayed correctly in my terminal:
import logging
import os
import socket

# Setup logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# worker name: public IP + process id
publicIP = socket.gethostbyname(socket.gethostname())
pid = os.getpid()
workerID = unicode(str(publicIP) + ":" + str(pid))
logger.info("Executing pipeline with worker {}".format(workerID))

res = executePipeline(config)
markedScore = res["marker.score"]
markedDetails = res["marker.detail"]
results = {'marker.detail': markedDetails, 'marker.score': markedScore}
return results
Is there a good way to get the logging working properly and see what the child processes of my executePipeline function are sending back?
Thanks for your help!
Romanzo
I had a similar problem when I tried to write parallelised tests that write results to a shared dictionary. multiprocessing.Manager was the answer:
import multiprocessing

# create shared results dictionary
manager = multiprocessing.Manager()
result_dict = manager.dict({})
So you can simply post the log messages from your processes to that shared dictionary (or a shared list) and then process them in the parent, where the logging handlers are configured.
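For example, here is a minimal sketch of that idea, using a manager-backed list since log messages form a sequence; the worker function and the messages are made up for illustration:

import logging
import multiprocessing

def worker(worker_id, shared_logs):
    # appends to the manager proxy are process-safe
    shared_logs.append("worker {0}: pipeline step done".format(worker_id))

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    manager = multiprocessing.Manager()
    shared_logs = manager.list()

    procs = [multiprocessing.Process(target=worker, args=(i, shared_logs))
             for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # replay the collected messages through the parent's handlers
    for line in shared_logs:
        logging.getLogger().info(line)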
Or use LOG = multiprocessing.get_logger(), as explained in the multiprocessing documentation: https://docs.python.org/2/library/multiprocessing.html
and here: How should I log while using multiprocessing in Python?
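As a rough sketch of that second approach (assuming a fork-based start on Unix, so child processes inherit the parent's logger configuration; the worker function is made up):

import logging
import multiprocessing

def worker(i):
    # the multiprocessing module's own logger, one per process
    multiprocessing.get_logger().info("worker %d running", i)

if __name__ == '__main__':
    # attach a stderr handler to the multiprocessing logger in the parent
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(logging.INFO)

    procs = [multiprocessing.Process(target=worker, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

With this, messages are written to stderr as they are emitted, rather than appearing only at the end of the job.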