logging while using parallel python

I'm using Parallel Python to execute a big function (executePipeline) multiple times. This function also uses multiprocessing (via the multiprocessing module).
I'm having trouble getting the logging messages to display correctly on my console when using the parallel python module. When I'm not using it, the logging messages display fine.

Here is how it works. I have a server that calls a worker every time it gets a request from a client, using:

job = self.server.job_server.submit(func = executeWorker, args = (config, ) )

This function is executed in a new thread every time there is a new request from a client. The worker then calls executePipeline, which spawns several processes using multiprocessing.

The server is a SocketServer.TCPServer using threading. I set up the root logger in my server as follows:

self.logger = logging.getLogger()
self.logger.setLevel(logging.INFO)
self.logger.addHandler(logging.StreamHandler())
self.job_server = pp.Server(ncpus = 8) # for test
self.jobs = []

When I run my server I only get the logging from executePipeline itself, not from its child processes. I also get executePipeline's logging only at the end of the job, not while it is running.

Here is the worker code. The "Executing pipeline with worker ..." message is displayed correctly in my terminal:

import logging
import os
import socket

'''
Setup logging
'''

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# worker name
publicIP = socket.gethostbyname(socket.gethostname()) 
pid = os.getpid()
workerID = unicode(str(publicIP) + ":" + str(pid))

logger.info( "Executing pipeline with worker {}".format(workerID))
res = executePipeline(config)    
markedScore = res["marker.score"]
markedDetails = res["marker.detail"]
results = {'marker.detail': markedDetails, 'marker.score': markedScore}

return results

Is there a good way to get the logging working properly and see what the child processes of my executePipeline function are sending back?

Thanks for your help!

Romanzo



1 Answer

I had a similar problem when I tried to write parallelised tests that post their results to a shared dictionary. multiprocessing.Manager was the answer:

# create shared results dictionary
manager = multiprocessing.Manager()
result_dict = manager.dict({})

You can simply post the log messages from the processes into that shared dictionary and then process them in the parent.
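
A minimal sketch of that pattern (the worker function and the message format here are made up for illustration): each child writes its messages into the managed dict, and the parent emits them through logging once the processes have joined.

import logging
import multiprocessing

logging.basicConfig(level=logging.INFO)

def worker(worker_id, result_dict):
    # store the message instead of logging it from the child
    result_dict[worker_id] = "pipeline finished for worker {}".format(worker_id)

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    result_dict = manager.dict()

    procs = [multiprocessing.Process(target=worker, args=(i, result_dict))
             for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # the parent process does the actual logging, so nothing gets lost
    for worker_id, message in result_dict.items():
        logging.info("worker %s: %s", worker_id, message)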

Or use LOG = multiprocessing.get_logger(), as explained here: https://docs.python.org/2/library/multiprocessing.html and here: How should I log while using multiprocessing in Python?
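
A short sketch of that second approach (the worker here is just a placeholder): log_to_stderr() attaches a StreamHandler to multiprocessing's own logger, so messages emitted by the child processes reach the console while they run, rather than at the end of the job.

import logging
import multiprocessing

def worker(n):
    # get_logger() returns the logger used by the multiprocessing module;
    # child processes can write to it directly
    multiprocessing.get_logger().info("processing item %s", n)

if __name__ == "__main__":
    # route the multiprocessing logger to stderr at INFO level
    multiprocessing.log_to_stderr(logging.INFO)
    pool = multiprocessing.Pool(4)
    pool.map(worker, range(8))
    pool.close()
    pool.join()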


vladosaurus